Scan system interface (ssi) module

ABSTRACT

A method for testing. The method includes sending a single instruction over a JTAG interface to a JTAG controller to select a first internal test data register of a plurality of data registers. The method includes programming the first internal test data register using the JTAG interface to configure mode control access and state control access for a test controller implementing a sequential scan architecture to test a chip at a system level.

RELATED APPLICATIONS

This application claims priority to and the benefit of the following applications:

-   U.S. Provisional Application Ser. No. 62/247,195, with Attorney Docket No. NVID-P-SC-15-0129-US01A, filed on Oct. 27, 2015, entitled “SCANS SYSTEMS AND METHODS”; and
-   U.S. Provisional Application Ser. No. 62/285,429, with Attorney Docket No. NVID-P-SC-15-0129-US01B, filed on Oct. 27, 2015, entitled “SCANS SYSTEMS AND METHODS”;

which are all hereby incorporated by reference in their entirety for all intents and purposes.

BACKGROUND

Chips including semiconductor integrated circuits undergo a variety of tests to determine whether the semiconductor devices are operating properly. There are various stages of testing to screen defective and/or underperforming chips to avoid the cost of passing a bad chip on to the next level of assembly. For example, the stages of testing at each level of assembly include wafer level testing, package level testing, board level testing, and system level testing. The sooner a bad chip is discovered, the less cost is wasted, since the chip is removed from the assembly chain. That is, the loss due to a bad chip at the wafer level is lower than the loss due to a bad part at the system level, because of the material and effort spent at subsequent stages of building a system. It is therefore essential to screen the parts at each level.

In particular, scan based tests of the circuits may be performed to test one or more similarly configured chips. Scan based tests of circuits on a chip include “scan shift” and “scan capture” operations. These scan based tests can operate on a scan chain of connected registers (e.g., flip-flops or latches) that are designed for testing by inputting data and analyzing the output data from each of the scan chains.

During production level testing, automatic test pattern generator (ATPG) test patterns are typically run to screen bad chips from good chips. ATPG test patterns are mainly run on automatic test equipment (ATE) during production testing at wafer level and/or package level to test chips in parallel.

It is desirable to run system level testing to perform routine maintenance and failure testing. In a practical example, chips may be used in the infotainment system of automobiles. Running online system level tests of chips already integrated into the infotainment system (i.e., after the automobile is ready for consumer purchase) is mandatory in the industry. For instance, it is necessary to perform fault diagnosis and testing during the maintenance of the automobile system. Online testing and diagnostics may follow industry standards, such as the functional safety standard for automobiles (e.g., ISO 26262) outlining functional safety features at each phase of product development for automobiles. Online testing and diagnostics may be performed to determine failure in time (FIT) rates, reliability grading, and resiliency grading for mission critical applications. In addition, system level testing may be performed on a field return part, wherein a chip which passes production testing incurs a failure when implemented into a system. As such, it is necessary to support online testing and diagnostics in automobile applications for these specialized chips.

A principal problem is that ATPG test patterns are difficult to implement at the system level, such as when performing online logic testing. System level testing heretofore included running scan debug tests, wherein all test scan chains are stitched into one single, long chain. The combined scan chain is driven from a test clock (e.g., TCK). However, this scan debug test is very slow because of the large number of flops in the chain that need to be loaded and unloaded. Also, ATPG test patterns cannot be directly applied in scan debug mode. Further, the additional infrastructure needed on a chip to support a scan debug mode is costly, and cannot be accommodated within the tight confines of the automobile cabin. As such, a scan debug mode is not feasible for performing system level testing.

Also, customers would like to run self-test patterns, such as logic built-in self-test (BIST), during power-on at the system level to make sure that the chip is still fully functional before entering into mission mode. However, logic BIST test approaches differ depending on the chip and the electronic design automation (EDA) tool support. In most cases, logic BIST does not provide high test coverage because of its random patterns. As such, it is not suitable or economical to implement logic BIST for the purpose of running online system level testing.

SUMMARY

It is desirable to have system level testing of chips using ATPG test patterns.

In embodiments of the present invention, a method for testing chips using a System Scan Interface (SSI) to enable online logic testing at system level is disclosed. The method includes sending a single instruction over a joint test action group (JTAG) interface to a JTAG controller to select a first internal test data register of a plurality of data registers. The method further includes programming the first internal test data register using the JTAG interface to configure mode control access and state control access for a test controller implementing a sequential scan architecture to test a chip at a system level.

In another embodiment, a computer system is described, wherein the computer system includes a processor, and memory coupled to the processor and having stored therein instructions that, if executed by the computer system, cause the computer system to execute a method for testing. The method includes sending an instruction to a JTAG controller to select a first internal test data register of a plurality of data registers. The method also includes programming the first internal test data register to configure mode control access and state control access for a test controller implementing a sequential scan architecture at a system level.

In still another embodiment, a non-transitory computer-readable medium having computer-executable instructions for causing a computer system to perform a method for testing is described. The method includes sending an instruction to a JTAG controller to select a first internal test data register of a plurality of data registers. The method also includes programming the first internal test data register to configure mode control access and state control access for a test controller implementing a sequential scan architecture at a system level.

These and other objects and advantages of the various embodiments of the present disclosure will be recognized by those of ordinary skill in the art after reading the following detailed description of the embodiments that are illustrated in the various drawing figures.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and form a part of this specification and in which like numerals depict like elements, illustrate embodiments of the present disclosure and, together with the description, serve to explain the principles of the disclosure.

FIG. 1 depicts a block diagram of an exemplary computer system suitable for implementing embodiments according to the present disclosure.

FIG. 2 is a block diagram illustrating a system scan interface in an ultra-fast-interface (UFI) module, in accordance with one embodiment of the present disclosure.

FIG. 3 is a block diagram illustrating the implementation of an SSI mode access for performing sequential scan compression, in accordance with one embodiment of the present disclosure.

FIG. 4 is a flow diagram illustrating a method for system level testing of a chip using a system scan interface, in accordance with one embodiment of the present disclosure.

FIG. 5 is a block diagram illustrating the use of a SYSTEM_UFI_FSM_CTRL register when performing system level testing of a chip using a system scan interface, in accordance with one embodiment of the present disclosure.

FIG. 6 is a block diagram illustrating the clocking used for a system scan interface in a sequential scan architecture, in accordance with one embodiment of the present disclosure.

FIG. 7A is a diagram of SSI mode generation when performing sequential scan compression during system level testing of a chip, in accordance with one embodiment of the present disclosure.

FIG. 7B is a diagram of SSI state generation when performing sequential scan compression during system level testing of a chip, in accordance with one embodiment of the present disclosure.

DETAILED DESCRIPTION OF THE INVENTION

Reference will now be made in detail to the various embodiments of the present disclosure, examples of which are illustrated in the accompanying drawings. While described in conjunction with these embodiments, it will be understood that they are not intended to limit the disclosure to these embodiments. On the contrary, the disclosure is intended to cover alternatives, modifications and equivalents, which may be included within the spirit and scope of the disclosure as defined by the appended claims. Furthermore, in the following detailed description of the present disclosure, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. However, it will be understood that the present disclosure may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the present disclosure.

Accordingly, embodiments of the present invention provide for implementing a scan compression architecture for online logic testing at the system level. Further, embodiments of the present invention provide the above advantages and also provide for running ATPG test patterns designed for production level testing at the system level.

Throughout this application, the term “SoC” may be analogous to the term “chip,” both defining an integrated circuit implemented on a single chip substrate. It may contain components of a computing system or other electronic system. In addition, the term “logic block” defines a specialized circuit design that performs one or more specific functions. The logic block may be integrated, in part, with other logic blocks to form an SoC. In addition, the term “logic block” may be analogous to the term “chiplet” or “design module.”

Some portions of the detailed descriptions that follow are presented in terms of procedures, logic blocks, processing, and other symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. In the present application, a procedure, logic block, process, or the like, is conceived to be a self-consistent sequence of steps or instructions leading to a desired result. The steps are those utilizing physical manipulations of physical quantities. Usually, although not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated in a computer system. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as transactions, bits, values, elements, symbols, characters, samples, pixels, or the like.

It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, it is appreciated that throughout the present disclosure, discussions utilizing terms such as “generating,” “supplying,” “configuring,” “dividing,” “scanning,” or the like, refer to actions and processes (e.g., in flowchart 2 of the present Application) of a computer system or similar electronic computing device or processor (e.g., computer system 100 of FIG. 1). The computer system or similar electronic computing device manipulates and transforms data represented as physical (electronic) quantities within the computer system memories, registers or other such information storage, transmission or display devices.

Other embodiments described herein may be discussed in the general context of computer-executable instructions residing on some form of computer-readable storage medium, such as program modules, executed by one or more computers or other devices. By way of example, and not limitation, computer-readable storage media may comprise non-transitory computer storage media and communication media. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types. The functionality of the program modules may be combined or distributed as desired in various embodiments.

Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, random access memory (RAM), read only memory (ROM), electrically erasable programmable ROM (EEPROM), flash memory or other memory technology, compact disk ROM (CD-ROM), digital versatile disks (DVDs) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed to retrieve that information.

Communication media can embody computer-executable instructions, data structures, and program modules, and includes any information delivery media. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), infrared and other wireless media. Combinations of any of the above can also be included within the scope of computer-readable media.

FIG. 1 is a block diagram of an example of a computing system 100 capable of implementing embodiments of the present disclosure. Computing system 100 broadly represents any single or multi-processor computing device or system capable of executing computer-readable instructions. Examples of computing system 100 include, without limitation, workstations, laptops, client-side terminals, servers, distributed computing systems, handheld devices, gaming systems, gaming controllers, or any other computing system or device. In its most basic configuration, computing system 100 may include at least one processor 105 and a system memory 110.

It is appreciated that computer system 100 described herein illustrates an exemplary configuration of an operational platform upon which embodiments may be implemented to advantage. Nevertheless, other computer systems with differing configurations can also be used in place of computer system 100 within the scope of the present invention. That is, computer system 100 can include elements other than those described in conjunction with FIG. 1. Moreover, embodiments may be practiced on any system which can be configured to enable it, not just computer systems like computer system 100. It is understood that embodiments can be practiced on many different types of computer systems 100. System 100 can be implemented as, for example, a desktop computer system or server computer system having a powerful general-purpose CPU coupled to a dedicated graphics rendering GPU. In such an embodiment, components can be included that add peripheral buses, specialized audio/video components, I/O devices, and the like. Similarly, system 100 can be implemented as a handheld device (e.g., cell phone, etc.) or a set-top video game console device, such as, for example, the Xbox®, available from Microsoft Corporation of Redmond, Wash., or the PlayStation3®, available from Sony Computer Entertainment Corporation of Tokyo, Japan, or any of the SHIELD Portable devices (e.g., handheld gaming console, tablet computer, television set-top box, etc.) available from Nvidia Corp. System 100 can also be implemented as a “system on a chip”, where the electronics (e.g., the components 105, 110, 115, 120, 125, 130, 150, and the like) of a computing device are wholly contained within a single integrated circuit die. Examples include a hand-held instrument with a display, a car navigation system, a portable entertainment system, and the like.

In the example of FIG. 1, the computer system 100 includes a central processing unit (CPU) 105 for running software applications and optionally an operating system. Memory 110 stores applications and data for use by the CPU 105. Storage 115 provides non-volatile storage for applications and data and may include fixed disk drives, removable disk drives, flash memory devices, and CD-ROM, DVD-ROM or other optical storage devices. The optional user input 120 includes devices that communicate user inputs from one or more users to the computer system 100 and may include keyboards, mice, joysticks, touch screens, and/or microphones.

The communication or network interface 125 allows the computer system 100 to communicate with other computer systems via an electronic communications network, including wired and/or wireless communication and including the internet. The optional display device 150 may be any device capable of displaying visual information in response to a signal from the computer system 100. The components of the computer system 100, including the CPU 105, memory 110, data storage 115, user input devices 120, communication interface 125, and the display device 150, may be coupled via one or more data buses 160.

In the embodiment of FIG. 1, a graphics system 130 may be coupled with the data bus 160 and the components of the computer system 100. The graphics system 130 may include a physical graphics processing unit (GPU) 135 and graphics memory. The GPU 135 generates pixel data for output images from rendering commands. The physical GPU 135 can be configured as multiple virtual GPUs that may be used in parallel (concurrently) by a number of applications executing in parallel.

Graphics memory may include a display memory 140 (e.g., a frame buffer) used for storing pixel data for each pixel of an output image. In another embodiment, the display memory 140 and/or additional memory 145 may be part of the memory 110 and may be shared with the CPU 105. Alternatively, the display memory 140 and/or additional memory 145 can be one or more separate memories provided for the exclusive use of the graphics system 130.

In another embodiment, graphics processing system 130 includes one or more additional physical GPUs 155, similar to the GPU 135. Each additional GPU 155 may be adapted to operate in parallel with the GPU 135. Each additional GPU 155 generates pixel data for output images from rendering commands. Each additional physical GPU 155 can be configured as multiple virtual GPUs that may be used in parallel (concurrently) by a number of applications executing in parallel. Each additional GPU 155 can operate in conjunction with the GPU 135 to simultaneously generate pixel data for different portions of an output image, or to simultaneously generate pixel data for different output images.

Each additional GPU 155 can be located on the same circuit board as the GPU 135, sharing a connection with the GPU 135 to the data bus 160, or each additional GPU 155 can be located on another circuit board separately coupled with the data bus 160. Each additional GPU 155 can also be integrated into the same module or chip package as the GPU 135. Each additional GPU 155 can have additional memory, similar to the display memory 140 and additional memory 145, or can share the memories 140 and 145 with the GPU 135.

Further, graphics system 130 may include sequential scan architecture 170 that is configured for using the system scan interface for enabling online testing of a SoC at the system level.

FIG. 2 is a block diagram illustrating a system scan interface in an ultra-fast-interface (UFI) module 200, in accordance with one embodiment of the present disclosure. The UFI module 200, shown in simplified form, is used to drive the centralized test controller 310 of FIG. 3, which performs the sequential scan compression in the sequential scan architecture used for testing a chip at the system level. In particular, embodiments of the present invention provide for using SSI to enable online logic testing at the system level. For example, sequential scan compression is implemented to achieve testing under reduced pin counts and improved test pattern quality when performing testing of a chip.

As shown in FIG. 2, the UFI module 200 includes a UFI state machine 210 and SSI registers 220. The SSI registers 220 are used to reset, read, and write the control signals required for the UFI module 200.

In particular, UFI module 200 can be implemented in two modes: (a) UFI mode for production testing using an ATE, and (b) SSI mode for system level testing. In the UFI mode, control signals 250 are generated using an external UFI scan pin (not shown). On the other hand, in the SSI mode, control signals 260 are generated using a JTAG/1500 interface (not shown), in one embodiment.

UFI module 200 includes a JTAG or 1500 test data register (TDR) to write/read UFI controls in JTAG mode. Table 1 shows control signals for the SYSTEM_UFI_FSM register 220.

TABLE 1

FIELD  MNEMONIC        RESET  R/W  DESCRIPTION
0      jtag_ufi_mode0  0      R/W  SSI Mode Select 0
0      jtag_ufi_mode1  0      R/W  SSI Mode Select 1
0      jtag_ufi_mode2  0      R/W  SSI Mode Select 2
0      jtag_ufi_mode3  0      R/W  SSI Mode Select 3
0      jtag_ufi_ir_dr  0      R/W  SSI Mode or State

FIG. 3 is a block diagram illustrating the implementation of SSI mode access for performing sequential scan compression in a sequential scan compression architecture 300, in accordance with one embodiment of the present disclosure. Sequential scan compression is mainly used for production testing of chips on ATE. However, in embodiments of the present invention sequential scan compression is used to translate ATPG test patterns, used during production testing, for use in an SSI mode of the sequential scan compression architecture 300. As such, the ATPG test patterns may be applied during system level testing of chips.

Running the sequential scan compression architecture 300 in SSI mode requires an implementation of an SSI interface to control the centralized test controller 310 in JTAG mode. In an IEEE 1500 IP based flow, various wrapper data registers are created to read, write, and send scan compression control signals. That is, the system scan interface is used to communicate control signals with the test controller 310 using the JTAG interface. The UFI module 200 operates in a JTAG mode when the sequential scan compression architecture 300 is in SSI mode.

The SSI mode can be accessed during system level testing of a chip, wherein the SSI mode follows the IEEE 1500 IP based flow, in one embodiment. As such, IEEE 1500 wrapper data registers are created for controlling reads and writes, and to send various mode and state control signals to the test controller 310. Those mode and state control signals are then communicated to sequential scan compression CODECs, such as sequential scan decompressor 370 and sequential scan compressor 380.

As previously introduced, the UFI module 200 is used to control the mode and state of the test controller 310. For example, when performing debug and bring-up of engineering samples during testing, there may be a need to read the mode and state values of the UFI module 200. This is achieved using the JTAG interface. During normal operation of sequential scan compression of the UFI module 200 operating in a JTAG mode for system level testing, the mode and state control signals need to be controlled using the IEEE 1500 cluster (not shown). In turn, the IEEE 1500 cluster is controlled using JTAG ports (not shown).

Further, the JTAG ports are used to dynamically access the various scan chains 390A-N involved in the sequential scan compression architecture 300. Previously, access of the scan chains involved using an “instruction register” to enable the specific chain that will be accessed. A “data register” is used to access (e.g., read or write) that specific scan chain. However, the use of the instruction register and data register involves too many cycles because a new instruction is needed for every access to a different data register. This detrimentally adds to the cycle overhead because there are many instructions written to one or more instruction registers for accessing a chip (e.g., during testing).
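By way of illustration only, the cycle overhead described above can be estimated with a rough count of test clock cycles. The following is a minimal sketch, not taken from this disclosure: the per-scan overheads follow the standard IEEE 1149.1 TAP state diagram with each scan starting and ending in Run-Test/Idle, while the function names, the instruction length, the data register length, and the number of accesses are all hypothetical.

```python
# Rough TCK cycle model for JTAG register accesses.
# Assumption: a standard IEEE 1149.1 TAP controller, with every scan
# starting and ending in Run-Test/Idle. All lengths are hypothetical.

# 4 clocks to reach Shift-IR, plus 2 clocks from Exit1-IR through
# Update-IR back to Run-Test/Idle (the final shift clock enters Exit1-IR).
IR_SCAN_OVERHEAD = 6
# 3 clocks to reach Shift-DR, plus 2 clocks from Exit1-DR through
# Update-DR back to Run-Test/Idle.
DR_SCAN_OVERHEAD = 5


def ir_scan_cycles(ir_len: int) -> int:
    """Clocks needed to load one instruction of ir_len bits."""
    return ir_len + IR_SCAN_OVERHEAD


def dr_scan_cycles(dr_len: int) -> int:
    """Clocks needed to shift one data register of dr_len bits."""
    return dr_len + DR_SCAN_OVERHEAD


def cycles_instruction_per_access(accesses: int, ir_len: int, dr_len: int) -> int:
    """Every access targets a different data register, so each one needs
    a new instruction followed by a data scan."""
    return accesses * (ir_scan_cycles(ir_len) + dr_scan_cycles(dr_len))


def cycles_single_instruction(accesses: int, ir_len: int, dr_len: int) -> int:
    """One instruction selects the register once; every later access is
    a data scan only."""
    return ir_scan_cycles(ir_len) + accesses * dr_scan_cycles(dr_len)


if __name__ == "__main__":
    n, ir_len, dr_len = 100, 8, 5  # hypothetical values
    print(cycles_instruction_per_access(n, ir_len, dr_len))
    print(cycles_single_instruction(n, ir_len, dr_len))
```

With these hypothetical values (100 accesses, an 8-bit instruction, a 5-bit data register), the model gives 2400 clocks when an instruction precedes every access and 1014 clocks when a single instruction is used, which is the kind of saving the single-instruction approach targets.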

On the other hand, embodiments of the present invention are configured to control sequential scan compression logic using the JTAG ports. Specifically, input data can be delivered to the various scan chains using the JTAG ports. Also, response data can be observed using the JTAG ports. Further, configuring the sequential scan compression logic architecture for testing only involves a single instruction, as will be described further below.

In one embodiment, the multiple input signature register (MISR) is used for storing data (e.g., a compressed test response) on the chip. In one implementation, during MISR unload, the MISR scan out is multiplexed (e.g., using a MUX) with the Wrapper Scan Out (WSO) to TDO using an additional instruction.
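As a hedged illustration of how a MISR compacts scan chain outputs into a short signature, the sketch below models a generic MISR as a linear feedback shift register whose state is XORed with one parallel input word per cycle. The width, feedback taps, and seed are hypothetical; this disclosure does not specify the MISR polynomial or size.

```python
def misr_signature(words, width=4, taps=(3, 0), seed=0):
    """Compact a sequence of parallel input words into one signature.

    Generic MISR model (an assumption, not the disclosed circuit):
    each cycle the state shifts left by one bit, the XOR of the tapped
    bits is fed into bit 0, and the input word is XORed into the state.
    """
    mask = (1 << width) - 1
    state = seed & mask
    for word in words:
        feedback = 0
        for tap in taps:
            feedback ^= (state >> tap) & 1
        state = ((state << 1) | feedback) & mask
        state ^= word & mask
    return state


if __name__ == "__main__":
    good = [0b1010, 0b0111, 0b0001, 0b1100]  # hypothetical responses
    bad = [0b1010, 0b0011, 0b0001, 0b1100]   # one bit flipped in word 1
    print(misr_signature(good) != misr_signature(bad))
```

Because the taps include the most significant bit, the state update in this model is invertible, so a single corrupted input word always changes the final signature. Multiple errors can still cancel (aliasing), as with any signature compaction.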

Also, in the IP based flow while in ATE mode, the dynamic standard test access (DSTA) load and unload modules 350 are used to accommodate fewer than available pins at the chip level for a specific SoC. In production testing or ATE mode, test controller 310 control inputs are driven from the UFI module 360. In particular, the aforementioned UFI_FSM 1500 Wrapper data registers 305 are reset using WRSTN in the ATE mode, and default programmed values will bring those registers into the production testing mode. Also, the UFI_FSM_CTRL register 305 is used to read mode/state control signals that are driven by the UFI module 360 in the ATE mode. The UFI_FSM_CTRL register 305 may be programmed during test-setup.

On the other hand, switching to online system level testing is achieved using the JTAG ports. In particular, the UFI_FSM 1500 Wrapper data registers 305, used for ATE mode, need to be programmed to enable online system level logic testing (e.g., setting ufi_mode to 0), which also disables the production level ATE test mode. In SSI mode, the multiplexor 320 is configured to select the proper MUX input, which will drive the test controller 310 using control inputs obtained from the SYSTEM_UFI_FSM test data register 307. In particular, during SSI mode, the test controller 310 control inputs 260 are driven from the IEEE 1500 SYSTEM_UFI_FSM WDRs. In this manner, mode and state control inputs are used to drive the test controller 310. These data registers (e.g., UFI_FSM 1500 Wrapper data registers 305 and SYSTEM_UFI_FSM test data register 307) are located in the same partition where the UFI module 200 and the centralized test controller 310 are integrated, in one embodiment. Table 2 lists the various DSTA modes.

TABLE 2

DSTA MODE  DSTA PINS  COMMENTS
4X         6          Basic DSTA Mode
6X         4          Flexible DSTA Mode
8X         3          Flexible DSTA Mode
12X        2          Flexible DSTA Mode
24X        1          Single Pin Mode for SSI
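The DSTA modes of Table 2 can be captured as a simple lookup. The sketch below is illustrative only: the mode names and pin counts are copied from Table 2, while the helper `select_dsta_mode` and its selection policy (use as many of the available pins as possible) are hypothetical and not described in this disclosure.

```python
# DSTA modes and pin counts copied from Table 2.
DSTA_PINS = {"4X": 6, "6X": 4, "8X": 3, "12X": 2, "24X": 1}


def dsta_ratio(mode: str) -> int:
    """Numeric ratio encoded in a mode name, e.g. "12X" -> 12."""
    return int(mode.rstrip("X"))


def select_dsta_mode(available_pins: int) -> str:
    """Hypothetical policy: pick the mode that uses the most pins
    without exceeding the number of pins available."""
    fitting = {m: p for m, p in DSTA_PINS.items() if p <= available_pins}
    if not fitting:
        raise ValueError("no DSTA mode fits the available pins")
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    for pins in (1, 2, 5, 6):
        print(pins, select_dsta_mode(pins))
```

As an observation about the listed values only, every row of Table 2 satisfies ratio × pins = 24, and the single-pin 24X mode is the one identified for SSI.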

FIG. 5 is a block diagram illustrating the use of a SYSTEM_UFI_FSM_CTRL register 307 in the sequential scan architecture 300 of FIG. 3 when performing system level testing of a chip using a system scan interface, in accordance with one embodiment of the present disclosure.

In particular, the sequential scan architecture 300 uses the centralized test controller 310 and codecs (e.g., PRPG as sequential decompressor 370, and MISR as sequential compressor 380) for implementing SSI mode. As previously described, the test controller 310 is controlled from a set of mode signals and state control signals. The SYSTEM_UFI_FSM_CTRL register 307 is selected during test setup using a single instruction. The patterns stored in the SYSTEM_UFI_FSM_CTRL register 307 are used to derive mode and state control signals.

In particular, putting the SYSTEM_UFI_FSM_CTRL register 307 into the TDR mode used for system level testing requires only a single instruction. In that manner, the system_ufi_ir_dr bit is set to 1. This allows for single-instruction access to the SYSTEM_UFI_FSM_CTRL register 307 at the beginning of, and throughout, system level testing. Mode and state control are implemented using the SYSTEM_UFI_FSM_CTRL register 307, a register in the plurality of test data registers, depending on the data written into that register 307.

In one embodiment, while the SYSTEM_UFI_FSM_CTRL register 307 is in TDR mode, it will be used in ping-pong fashion to control modes and states alternately. In the ping-pong implementation, in one cycle the SYSTEM_UFI_FSM_CTRL register 307 is used for mode control, and in the next cycle the SYSTEM_UFI_FSM_CTRL register 307 is used for state control. These cycles are repeated.

For example, upon “Reset” (e.g., after a STATE_WRITE phase, or upon initiation of the register 307 for testing), the SYSTEM_UFI_FSM_CTRL register 307 acts as a mode register. In the next access, the SYSTEM_UFI_FSM_CTRL register 307 is used for state control. In the ping-pong fashion, the SYSTEM_UFI_FSM_CTRL register 307 alternates between mode access and state access repeatedly (e.g., mode access, state access, mode access, state access, etc.).
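The ping-pong behavior described above can be sketched as a two-phase toggle. This is a minimal behavioral model under stated assumptions, not the disclosed hardware: the class and method names are hypothetical, and the model tracks only which kind of access comes next.

```python
from enum import Enum


class Access(Enum):
    MODE = "mode"
    STATE = "state"


class PingPongRegister:
    """Behavioral model: after reset the register acts as a mode
    register, and each access flips it between mode and state control."""

    def __init__(self) -> None:
        self.reset()

    def reset(self) -> None:
        # Upon reset, the next access is a mode access.
        self.next_access = Access.MODE

    def access(self) -> Access:
        """Return the kind of the current access and toggle for the next."""
        current = self.next_access
        self.next_access = (
            Access.STATE if current is Access.MODE else Access.MODE
        )
        return current


if __name__ == "__main__":
    reg = PingPongRegister()
    print([reg.access().value for _ in range(4)])
```

In this model a state access always returns the register to mode control for the next access, which matches the description of the register acting as a mode register after a STATE_WRITE phase.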

Specifically, mode control signals are decoded based on mode values written during the MODE_WRITE phase to the SYSTEM_UFI_FSM_CTRL register 307. In this phase, the “jtag_ufi_ir_dr” bit is set to “1”. The “jtag_ufi_ir_dr” bit will not allow writes to this register when set to “0”, such as when the SYSTEM_UFI_FSM_CTRL register 307 is operating in a state control phase (e.g., STATE_WRITE phase). In this case, the existing mode bits remain the same as written in the previous MODE_WRITE. During the STATE_WRITE phase of operation for the SYSTEM_UFI_FSM_CTRL register 307, the state control signals (e.g., captureDR, shiftDR, updateDR signals) are used to derive JTAG states.

The ping-pong access of the SYSTEM_UFI_FSM_CTRL test data register 307 allows the SYSTEM_UFI_FSM_CTRL register 307 to be used for mode control and state control without accessing any other instruction register. That is, the values and configuration stored in the SYSTEM_UFI_FSM_CTRL register 307 determine whether the control signals are used for mode control or state control to drive the test controller 310.

Table 3 shows details of the SYSTEM_UFI_FSM_CTRL register 307. In particular, when the SYSTEM_UFI_FSM_CTRL register 307 is selected, the test data input (TDI) pin will be connected to various scan chain inputs. In addition, test data output (TDO) will be connected to the scan chain outputs, depending on the mode control signals.

TABLE 3

PORT              TYPE      DESCRIPTION
WSI               input     Wrapper Serial Input
WSO               output    Wrapper Serial Output
WRCK              input     Wrapper Input Clock
WRSTN             input     Wrapper Reset
captureWR         input     Wrapper captureDR
shiftWR           input     Wrapper shiftDR
updateWR          input     Wrapper updateDR
jtag_ufi_ir_dr    output    SSI IR or DR selection
jtag_ufi_mode3    output    SSI UFI mode bit 3
jtag_ufi_mode2    output    SSI UFI mode bit 2
jtag_ufi_mode1    output    SSI UFI mode bit 1
jtag_ufi_mode0    output    SSI UFI mode bit 0

FIG. 5 is a block diagram illustrating the use of a SYSTEM_UFI_FSM_CTRL register 307 when performing system level testing of a chip using a system scan interface, in accordance with one embodiment of the present disclosure. A sequence of operations and use of the SYSTEM_UFI_FSM_CTRL register 307 for both mode control and state control in the sequential scan architecture 300 of FIG. 3 is described below in relation to FIG. 5.

During JTAG reset, the SYSTEM_UFI_FSM_CTRL register 307 is initialized to 5′b00000, in one embodiment. For example, register 307 is initialized at the beginning of system level testing, and after each STATE_WRITE phase.

During mode control access, when the MSB of the SYSTEM_UFI_FSM_CTRL register 307 is programmed to “0”, this allows access to register 307. In particular, the user will write the MSB of register 307 to 1′b1, assuming that the next access will be state control access (e.g., ping-pong fashion). In one implementation, when the MSB of the SYSTEM_UFI_FSM_CTRL register 307 is programmed to “1,” and mode2, mode1, and mode0 are programmed to the intended mode controls, this combination of mode bits will give the corresponding mode control test (e.g., selected between 8 mode tests based on a 3 bit value) and an associated update IR. In one implementation, mode3 is reserved for a later purpose (e.g., increasing the number of mode control tests).
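As an illustration of the field layout implied above, the following Python sketch splits a 5-bit SYSTEM_UFI_FSM_CTRL value into a select bit and mode bits. The bit ordering (MSB mapped to jtag_ufi_ir_dr, followed by mode3 down to mode0) is an assumption based on the port order in Table 3 and the 5-bit reset value; the names CtrlFields and decode_ctrl are hypothetical and do not appear in the disclosure.

```python
from typing import NamedTuple


class CtrlFields(NamedTuple):
    ir_dr: int     # MSB; assumed to correspond to jtag_ufi_ir_dr
    mode3: int     # reserved bit per the description
    mode_sel: int  # mode2..mode0 read as a 3-bit index (0..7)


def decode_ctrl(value: int) -> CtrlFields:
    """Split a 5-bit register value into its assumed fields."""
    if not 0 <= value <= 0b11111:
        raise ValueError("register value must fit in 5 bits")
    return CtrlFields(
        ir_dr=(value >> 4) & 0x1,
        mode3=(value >> 3) & 0x1,
        mode_sel=value & 0b111,
    )


# Example: MSB set, mode3 clear, mode2..mode0 = 3'b101 (one of 8 mode tests).
fields = decode_ctrl(0b10101)
print(fields)
```

With three mode bits, mode_sel ranges over eight values, which matches the eight selectable mode tests mentioned above.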

In the earlier mode control access, the MSB of the SYSTEM_UFI_FSM_CTRL register 307 is programmed to 1′b1. This allows state control signals to be used to derive TLR, RTI, CaptureDR, ShiftDR, Exit1DR, and UpdateDR during state control access. This sequence is used by the sequential scan compression test controller 310 of FIG. 3 to access sequential scan compression chains dynamically. These various chains will be considered as shift registers during this access. For example, during state control access, “jtag_ufi_fsm_scanin” and “jtag_ufi_fsm_scanout” are available to send “scanin” data to the sequential decompression codec 370 from TDI, and observed MISR unload data is output from the sequential compression codec 380 to TDO.

At the end of the test sequence of the sequential scan compression procedure during the state control access, “call update DR” will reset the MSB of the SYSTEM_UFI_FSM_CTRL register 307, so that register 307 is again ready for mode access.
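The ping-pong sequence described above can be sketched as a small behavioral model. This is a simplified illustration, not the disclosed hardware: the class name PingPongCtrl and its methods are hypothetical, the MSB is assumed to be bit 4, and the model assumes that the update DR at the end of state control access returns the whole register to its 5′b00000 reset value.

```python
class PingPongCtrl:
    """Toy model of a register that alternates between mode and state access."""

    RESET_VALUE = 0b00000

    def __init__(self) -> None:
        self.value = self.RESET_VALUE

    @property
    def next_access(self) -> str:
        # MSB clear: the next access is mode control; MSB set: state control.
        return "state" if (self.value >> 4) & 1 else "mode"

    def mode_write(self, mode_bits: int) -> None:
        """Write the mode bits and set the MSB so the next access is state control."""
        if self.next_access != "mode":
            raise RuntimeError("register is currently in state control access")
        self.value = 0b10000 | (mode_bits & 0b1111)

    def state_update_dr(self) -> None:
        """End of state control access: reset the register (assumed: all 5 bits)."""
        if self.next_access != "state":
            raise RuntimeError("register is currently in mode control access")
        self.value = self.RESET_VALUE


ctrl = PingPongCtrl()
sequence = []
for mode in (0b001, 0b010):
    sequence.append(ctrl.next_access)   # mode access
    ctrl.mode_write(mode)
    sequence.append(ctrl.next_access)   # state access
    ctrl.state_update_dr()
print(sequence)  # ['mode', 'state', 'mode', 'state']
```

The model only tracks which kind of access comes next; it does not model the JTAG TAP states themselves.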

In one embodiment, during SYSTEM UFI mode or SSI mode to implement system level testing, the Dynamic Standard Test Access (SERDES) will be configured in 24× mode. As such, wait cycles are added during RTI to make sure the mode and state control signals transferred over a JTAG interface over a relatively fast JTAG clock domain are correctly transferred to the slower clock domain (e.g., if clock) used within the sequential scan architecture 300.

FIG. 6 is a block diagram illustrating the clocking used for a system scan interface in a sequential scan architecture 600, in accordance with one embodiment of the present disclosure. For SSI mode, it is required to prepare the testing procedures (e.g., xtr procedures including sequential scan compression) according to SYSTEM_UFI_FSM_CTRL register access. In particular, SSI mode is enabled in the sequential scan architecture 600 when UFI_MODE=“0”. This is accomplished by disabling production of the UFI mode, and enabling the JTAG mode, as previously described.

As shown in FIG. 6, the clocking for JTAG mode will be from TCK (fast clock). For example, input data is clocked using TCK. A clock divider is configured to divide TCK to generate a slow clock which is used to drive the test controller, sequential codecs, and DSTA load/unload modules. Further, the capture clocks will be from “occ.” As such, the scan chain access during sequential scan compression is accomplished using the SSI interface and JTAG ports.

FIG. 7A is a diagram of SSI mode generation when performing sequential scan compression during system level testing of a chip, in accordance with one embodiment of the present disclosure. FIG. 7B is a diagram of SSI state generation when performing sequential scan compression during system level testing of a chip, in accordance with one embodiment of the present disclosure.

FIG. 4 is a flowchart of a computer-implemented method for the dynamic configuration of DSTA modules associated with a logic block to implement flexible bandwidth ratios for test pattern reuse of the logic block, according to embodiments of the present invention. Although specific steps are disclosed in the flowcharts, such steps are exemplary. That is, embodiments of the present invention are well-suited to performing various other steps or variations of the steps recited in the flowcharts.

FIG. 1A is a block diagram illustrating a DSTA module for a logic block, wherein the DSTA module is configured to implement flexible bandwidth ratios for test pattern reuse of the logic block, in accordance with one embodiment of the present disclosure. In particular, because of the capability of the DSTA module to adjust the external scan data rate and the internal data rate, different bandwidth ratios can be supported for various SoC platforms having the same logic block. This provides for flexibility of adjusting bandwidth ratios during testing of one or more SoC platforms, each having the same logic block. Because the various SoC platforms are capable of being tested by adjusting the bandwidth ratios to support the internal data rate of the logic block, a costly redesign of the logic block is avoided when incorporating the logic block within a new SoC having a different bandwidth ratio than the source SoC for which the logic block was originally designed.

In another embodiment, the DSTA load module 120 and DSTA unload module 125 are integrated at the SoC level, instead of at the logic block level. That is, the SoC at the edge of the device includes the DSTA load and unload modules in anticipation of reusing the above referenced logic block and/or other logic blocks, wherein the SoC may be accessed using different numbers of test access pin counts than was originally anticipated when designing the logic block.

As shown, the logic block 100A includes a plurality of input connections 140 used for test access. For example, the plurality may include “L” input connections. The input connections 140 may be configured as serial scan input (SSI) connections. The SSI connections 140 may be configured for receiving test data used to determine whether the components in the logic block 100A are operating properly. Each of the SSI connections 140 may be dedicated for testing, or may be configured to both serve as a test input and/or to serve some functional purpose other than for testing. For example, during testing an ATE may be delivering test data to the logic block 100A through one or more of the SSI connections 140. The test data may be introduced by the ATE to input connections at the edge of the SoC and internally routed to the SSI connections 140, or may be directly inputted by the ATE to the SSI connections 140. The test data is clocked over the SSI connections 140 using the fast clock signal 115. The fast clock signal 115 is generated from an external clock, such as a clock supplied by the ATE. That is, the ATE may deliver test data at high clocking frequencies to reduce the amount of time the logic block 100A is undergoing testing.

Further, the logic block 100A includes a plurality of output connections 145. For example, the plurality may include “N” output connections. The output connections 145 may be configured as serial scan output (SSO) connections. The SSO connections 145 may be configured for delivering results from testing the logic block 100A back to the ATE, either directly or as internally routed through an SoC. Each of the SSO connections 145 may be dedicated for testing, and more specifically for delivering test results, or may be configured to both serve as a test output and/or to serve some functional purpose other than for testing. For example, during testing an ATE may be receiving test results from the logic block 100A through one or more of the SSO connections 145. The results may be clocked over the SSO connections 145 again using the fast clock signal 115.

A DSTA load module 120 is coupled to the plurality of SSI connections 140, and is configured to receive the test data. The DSTA load module 120 is configured to deserialize the test data received from the ATE over the SSI connections 140 using a fast clock but narrow width of channels. Specifically, the DSTA load module 120 slows down the delivery of the test data using the slow clock signal 110 in accordance with a bandwidth (BW) ratio that is configurable. The slow clock signal 110 is derived from the fast clock signal 115 (e.g., generated using a clock divider). More particularly, the bandwidth ratio, in part, defines the frequency of the fast clock 115 and the frequency of the slow clock 110. For example, the bandwidth ratio is defined by the following equation:

BW ratio=(frequency of fast clock)/(frequency of slow clock).  (1)

In addition, when deserializing the test data, the DSTA load module 120 spreads the test data across a plurality of channels also in accordance with the bandwidth ratio, as will be further described in relation to FIG. 1B. In particular, the test data is spread over a wider number of channels than those associated with the SSI connections 140, but at a slower speed. For example, the test data from one SSI connection that is clocked at a fast frequency of the fast clock signal 115 may be spread to multiple pseudo scan input (PSI) connections 133 clocked at the slow frequency of the slow clock signal 110, in one embodiment. In another embodiment, the test data from one or more SSI connections clocked using the fast clock signal 115 may be spread to multiple PSI connections 133 clocked using the slow clock signal 110 in accordance with the bandwidth ratio. In that case, the bandwidth ratio is also defined by the following equation:

BW ratio=(number of PSI connections)/(number of SSI connections).  (2)
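Equations (1) and (2) can be checked numerically with a short Python sketch. The function names and the example frequencies below are illustrative assumptions and are not values taken from the disclosure.

```python
def bw_ratio_from_clocks(fast_hz: int, slow_hz: int) -> float:
    """Equation (1): ratio of fast clock frequency to slow clock frequency."""
    return fast_hz / slow_hz


def bw_ratio_from_channels(num_psi: int, num_ssi: int) -> float:
    """Equation (2): ratio of PSI connection count to SSI connection count."""
    return num_psi / num_ssi


# Assumed example: a 400 MHz fast clock divided down to 100 MHz,
# with one SSI connection feeding four PSI connections.
ratio_clk = bw_ratio_from_clocks(400_000_000, 100_000_000)
ratio_chan = bw_ratio_from_channels(num_psi=4, num_ssi=1)
print(ratio_clk, ratio_chan)  # 4.0 4.0
```

When both equations give the same value, the narrow, fast external interface and the wide, slow internal interface carry the same amount of data per unit time.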

As shown, the plurality of channels internal to the logic block 100A is defined by a plurality of scan chains 130. The channels are also defined by a plurality of inputs to the scan chains (PSI connections 133) and a plurality of outputs of the scan chains (pseudo scan output (PSO) connections 135). Typically, each scan chain is associated with one PSI and one PSO. The scan chain includes state logic (e.g., flip-flops, latches, etc.) coupled together (e.g., in sequence) by a shift register. For example, the shift register may define a cascade of flip-flops, wherein the output of one flip-flop is coupled to the input of the next flip-flop in the cascade.

As shown in FIG. 1A, a DSTA unload module 125 is coupled to the plurality of PSO connections 135. The DSTA unload module 125 is configured to deliver test results from the plurality of scan chains 130 over the plurality of SSO connections 145. In particular, the DSTA unload module 125 is configured to serialize the test results for delivery to the ATE through the SSO connections 145, either directly to the ATE or routed internally through an SoC. Specifically, the DSTA unload module 125 receives the test results from the outputs of the scan chains over the PSO connections 135 that are clocked using the slow clock signal 110. During serialization, the DSTA unload module 125 speeds up the delivery of the test results over the SSO connections 145 using the fast clock signal 115 in accordance with a bandwidth ratio that is configurable, in one embodiment.

In addition, when serializing the test results, the DSTA unload module 125 consolidates test results received across lanes of scan chains down to a fewer number of SSO connections 145. In one embodiment, the consolidation of test results from the lanes to the SSO connections 145 is performed in accordance with the bandwidth ratio, as will be further described in relation to FIG. 1B. For example, the test results from multiple scan chains clocked using the slow clock signal 110 are consolidated for delivery over one or more SSO connections that are clocked using the fast clock signal 115, in accordance with the bandwidth ratio. In another embodiment, the consolidation of test results from the lanes to the SSO connections is not performed in accordance with the bandwidth ratio. For example, the test results may include a minimum of data, and can be delivered using a reduced number of SSO connections not in accordance with the bandwidth ratio, such as a single SSO connection.

DSTA modules used in scan architectures for testing in embodiments of the present invention, with and without test compression, are flexible and can support various clock frequency configurations (e.g., bandwidth ratios). With this flexibility, the logic blocks can be used in one or more SoC platforms with varying, low pin count test modes. The DSTA load module 120 allows flexible bandwidth ratios while keeping the interface standard, so that the test patterns originally generated for a particular logic block can be reused. In that manner, a logic block originally designed for a source SoC and successfully tested may be reused in other derivative SoC platforms. The logic block may be reused even when the derivative SoC has a different number of test input/output connections available in the package. For instance, the derivative SoC may have a lower number of connections available for testing based on the addressed or targeted markets. Embodiments of the present invention provide for logic block reuse in the derivative SoC with a lesser number of test connections assigned while maintaining the same test patterns for testing the logic block in the derivative SoC.

As such, the DSTA load module 120 and DSTA unload module 125 can support various bandwidth ratios. For illustration, the DSTA modules can be configured in 24-to-1, 12-to-1, 8-to-1, 6-to-1, and 4-to-1 bandwidth ratios. As an example, for an 8-to-1 bandwidth ratio, each SSI of the DSTA load module 120 can drive up to eight PSIs, each operating eight times slower. The DSTA module design of embodiments of the present invention is not limited to these ratios described for illustration, and can be extended to any ratio. In one embodiment, the DSTA load module 120 and DSTA unload module 125 are local to the logic block. In another embodiment, the DSTA load module 120 and DSTA unload module 125 are local to the SoC and internally routed from the edge of the SoC to the logic block of interest.

FIG. 1B is a block diagram illustrating load and unload DSTA modules of a logic block 100B implementing a particular bandwidth ratio, in accordance with one embodiment of the present disclosure. FIG. 1B is an illustration of one implementation of the logic block 100A of FIG. 1A, wherein the DSTA load module 120 and DSTA unload module 125 of FIG. 1B are configured having a 4-to-1 bandwidth ratio. Other configurations of the DSTA modules allow for differently configured bandwidth ratios supporting the particular logic block.

In particular, the scan_in_0 signal over the SSI input channel 180 is delivered to the DSTA load module 120 using a fast clock signal from an external clock (not shown). For example, the scan_in_0 signal is delivered to the deserializer 155 of the DSTA load module 120, wherein the deserializer 155 is configured to divide the fast clock signal down to a slow clock signal, and spread the test data across multiple PSI input lanes, in accordance with the 4-to-1 bandwidth ratio. For a 4-to-1 bandwidth ratio, the test data inputted from the scan_in_0 signal is spread across four PSI lanes (e.g., PSI-0, PSI-1, PSI-2, and PSI-3), in accordance with the bandwidth ratio. More particularly, the test data across four clock cycles of the fast clock are spread across the four PSI lanes, wherein test data in one of the four clock cycles of the fast clock is delivered to a corresponding PSI lane for delivery to a scan chain over a single clock cycle of the slow clock. As shown, the test data received by PSI-0 is delivered to scan chain 171 after decompression, test data received by PSI-1 is delivered to scan chain 172, test data received by PSI-2 is delivered to scan chain 173, and test data received by PSI-3 is delivered to scan chain 174. The deserialization of test data is further described in relation to FIGS. 3A-E.
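The 4-to-1 deserialization described above can be sketched in Python as follows. This is a minimal behavioral illustration, not the disclosed circuit: the function name deserialize is hypothetical, and the lane ordering (the bit from the first fast clock cycle goes to PSI-0) is an assumption.

```python
from typing import List, Sequence


def deserialize(serial_bits: Sequence[int], ratio: int) -> List[List[int]]:
    """Group a serial stream into slow-clock words, one bit per PSI lane.

    Each returned word holds the bits presented on PSI-0..PSI-(ratio-1)
    during one slow clock cycle.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    if len(serial_bits) % ratio != 0:
        raise ValueError("stream length must be a multiple of the ratio")
    return [
        list(serial_bits[i:i + ratio])
        for i in range(0, len(serial_bits), ratio)
    ]


# Eight fast clock cycles on scan_in_0 become two slow clock cycles on four lanes.
scan_in_0 = [1, 0, 1, 1, 0, 0, 1, 0]
words = deserialize(scan_in_0, ratio=4)
print(words)  # [[1, 0, 1, 1], [0, 0, 1, 0]]

# Per-lane view: the bits each PSI lane shifts into its scan chain.
lanes = [list(lane) for lane in zip(*words)]
print(lanes)  # [[1, 0], [0, 0], [1, 1], [1, 0]]
```

Each lane receives one bit per slow clock cycle, so the four lanes together carry the same data as the single fast input.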

The test data may be delivered from the ATE in compressed form. The number of total scan chains, and length of longest chain in scan based design are determined by total number of flops and available number of scan inputs/outputs of the SoC platforms or logic blocks and test channels available on an ATE. Test time reduction can be achieved through test data compression, which reduces the test data volume to be stored on ATE. This is done by driving test stimuli from ATE to multiple internal short chains using on-chip decompression logic and compacting the responses from these internal multiple short chains using on-chip compression logic. With this compression technique, a small number of ATE channels can drive a larger number of shorter internal scan chains, and the depth of each ATE channel is minimized, which can reduce ATE test time significantly.
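The effect of driving many short internal chains can be illustrated with simple arithmetic. The sketch below assumes ideally balanced chains and uses made-up flop and chain counts; the function name longest_chain_length is hypothetical, and the model ignores capture cycles and codec overhead.

```python
def longest_chain_length(total_flops: int, num_chains: int) -> int:
    """Length of the longest chain when flops are split as evenly as possible."""
    if num_chains < 1:
        raise ValueError("at least one scan chain is required")
    return (total_flops + num_chains - 1) // num_chains  # ceiling division


# Assumed numbers for illustration only.
total_flops = 120_000
without_compression = longest_chain_length(total_flops, num_chains=8)
with_compression = longest_chain_length(total_flops, num_chains=480)
print(without_compression, with_compression)  # 15000 250
```

Because the number of shift cycles per pattern tracks the longest chain, shorter internal chains reduce the shift depth needed for each pattern in this idealized model.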

As a result, the test data over each of the PSI lanes (e.g., PSI-0 through PSI-3) is decompressed before delivery to corresponding scan chains 171-174 using decompression module 150. Also, after testing, the test results are compressed again for the fastest delivery back to the ATE through the DSTA unload module 125 in accordance with the bandwidth ratio. As shown, test results from scan chain 171 are compressed using compression module 155 and delivered over PSO-0 to the serializer 195, test results from scan chain 172 are compressed and delivered over PSO-1 to the serializer 195, test results from scan chain 173 are compressed and delivered over PSO-2 to the serializer 195, and test results from scan chain 174 are compressed and delivered over PSO-3 to the serializer 195.

The serializer 195 is configured to receive test results over the PSO lanes that are clocked using the slow clock, and consolidate the test results for delivery in the single scan_out_0 output signal over the SSO output channel 185. That is, test results from the scan chains 171-174 are consolidated and delivered over the SSO output 185, in accordance with the bandwidth ratio. For a 4-to-1 bandwidth ratio, the test results received over the PSO lanes (e.g., PSO-0, PSO-1, PSO-2, and PSO-3) over one clock cycle of the slow clock signal are collected and delivered as a scan_out_0 signal in the output channel 185 over four clock cycles of a fast clock signal. The serialization of test results is further described in relation to FIGS. 3A-E.
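The consolidation performed on the unload side can be sketched in the same spirit. This is a minimal behavioral illustration: the function name serialize is hypothetical, and the output ordering (PSO-0 is sent first within each group of fast clock cycles) is an assumption.

```python
from typing import List, Sequence


def serialize(slow_words: Sequence[Sequence[int]], ratio: int) -> List[int]:
    """Flatten slow-clock words (one bit per PSO lane) into a serial stream.

    Each word is emitted over `ratio` fast clock cycles.
    """
    stream: List[int] = []
    for word in slow_words:
        if len(word) != ratio:
            raise ValueError("each slow-clock word must carry one bit per lane")
        stream.extend(word)
    return stream


# Two slow clock cycles on four PSO lanes become eight fast clock cycles.
pso_words = [[1, 0, 1, 1], [0, 0, 1, 0]]
scan_out_0 = serialize(pso_words, ratio=4)
print(scan_out_0)  # [1, 0, 1, 1, 0, 0, 1, 0]
```

Under these assumptions, one slow clock cycle of results on four PSO lanes occupies four fast clock cycles on the single SSO output.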

In particular, the SSI input (e.g., scan_in_0) is delivered to the DSTA load module 120 using a fast clock signal from an external clock (not shown). For example, scan_in_0 is delivered to the deserializer 155 of the DSTA load module 120, wherein the deserializer 155 is configured to divide the fast clock signal down to a slow clock signal, and spread the test data across multiple PSI input lanes, in accordance with the 4-to-1 bandwidth ratio. For a 4-to-1 bandwidth ratio, the test data inputted from the scan_in_0 input is spread across four PSI lanes (e.g., PSI-0, PSI-1, PSI-2, and PSI-3), in accordance with the bandwidth ratio. More particularly, the test data across four clock cycles of the fast clock are spread across the four PSI lanes, wherein test data in one of the four clock cycles of the fast clock is delivered to a corresponding PSI lane for delivery to a scan chain over a single clock cycle of the slow clock. As shown, the test data received by PSI-0 is delivered to scan chain 171 after decompression, test data received by PSI-1 is delivered to scan chain 172, test data received by PSI-2 is delivered to scan chain 173, and test data received by PSI-3 is delivered to scan chain 174. The deserialization of test data is further described in relation to FIGS. 3A-E.

FIG. 2 is a flow diagram 200 illustrating a method for test pattern reuse for a logic block as integrated within different SoC platforms, wherein the corresponding logic block and/or SoC includes a DSTA module configured to implement flexible bandwidth ratios for test pattern reuse of the logic block, in accordance with embodiments of the present disclosure. In particular, FIG. 2 is a flow diagram 200 illustrating a method for configuring bandwidth ratios between external scan data (from/to ATE) and an internal scan chain data associated with a particular logic block. In this manner, embodiments of the present invention match the data bandwidth between high speed scan data from the ATE and internal scan chains of the logic block. As a benefit, scan pin requirements are reduced to a minimum, and the available channel bandwidth (e.g., speeds of up to 1600 MHz carrying data over an optimum number of channels) carrying ATE test data may be fully utilized. The method outlined in flow diagram 200 is implementable by one or more components of the logic blocks 100A-B of FIGS. 1A-B, or one or more components of a SoC in various embodiments of the present invention.

In that manner, the method of flow diagram 200 allows for a designed test pattern for a particular logic block to be reused in different, derivative SoC platforms that also incorporate that logic block. As such, even though a derivative chip may have a different (e.g., lower) number of access connections available in the package (e.g., SoC) and assigned to that logic block, based on the addressed markets to which the derivative chip is targeted, the test pattern may still be delivered over the access connections by configuring the bandwidth ratio to match the data bandwidth between the high speed external scan data (e.g., from the ATE) and the internal scan chain data of the logic block. By configuring the bandwidth ratio properly, the available channel bandwidth assigned to the ATE may be fully utilized, and the number of scan pin requirements may be reduced to a minimum.

At 210, the method includes generating a first external clock frequency. For example, an external clock may generate the first external clock frequency. In one embodiment, the external clock may be generated by an ATE that is used for testing the logic block that is integrated into various SoC platforms.

At 220, the method includes supplying test data over a first plurality of SSI connections clocked at the first external clock frequency. The test data is designed for testing a particular logic block, and may be generated through an automatic test pattern generation tool. The test data may be driven by an ATE when testing the logic block. For example, the test data may be input to a plurality of scan chains of the logic block and subsequently read out from the scan chains to determine whether there are errors within the components of the scan chain.

In general, test data is delivered to the internal circuits of a device (e.g., SoC and/or logic block of an SoC), wherein state logic (e.g., flip-flops, latches, etc.) are connected together in a plurality of scan chains. For example, a long shift register connects the flip-flops in a corresponding scan chain. The scan chains are used to access the internal nodes of the circuit. The test data is shifted into the scan chains, and clocked through the scan chain at the internal scan chain data rate during capture cycles. The results are then shifted out from the device and delivered to the output connections of the corresponding SoC.

At 230, the method includes configuring a DSTA module for the logic block that is integrated within a first chip to a first bandwidth ratio. The first bandwidth ratio is configurable to match the delivery of the external scan data (e.g., the rate at which the test data is delivered from the ATE) and the delivery of the internal scan chain data (e.g., the rate the test data is delivered over the internal scan chains of the logic block). The matching may occur on a frequency level and/or channel width level. For example, the first bandwidth ratio defines the first plurality of SSI connections and a first plurality of PSI connections of the logic block, as previously described.

At 240, the method includes dividing the first external clock frequency down using the first bandwidth ratio to generate a first internal clock frequency. For example, a clock divider may perform the dividing process. The dividing is performed in accordance with the first bandwidth ratio, wherein the ratio also defines or is based on the rate at which the external scan data is input using the SSI connections, and the rate or frequency at which the internal scan chain data is clocked into the scan chains, as previously described. That is, the first bandwidth ratio also defines the first external clock frequency and the first internal clock frequency, which is based on the number of first plurality of SSI connections used for test access and the number of first plurality of PSI connections of the logic block.

In one embodiment, the matching of bandwidths (e.g., the external scan data rate and the internal scan chain data rate) is performed by changing the external scan data rate, while keeping the internal scan chain data rate the same, as originally designed for integration within a source SoC. For instance, the internal scan chain data rate is not varied in any derivative SoC incorporating the logic block, and the number of PSI connections used for testing in the logic block remains fixed. In another embodiment, the matching of bandwidths is performed by keeping the external data rate the same, and changing the internal scan chain data rate. That is, the number of PSI connections used for testing in the logic block may change from one derivative SoC to another derivative SoC, both incorporating the logic block. In still another embodiment, the matching of bandwidths is performed by modifying both the external data rate and the internal scan data rate. As such, the utilization of channel bandwidth (e.g., for receiving input test data from the ATE) is maximized and/or optimized. In addition, by having the ability to change the external data rate and/or the internal scan chain data rate, the scan shift into the internal scan chains can be run at a desired speed that is based on timing signoff.

At 250, the method includes scanning the test data over the first plurality of PSI connections clocked at the first internal clock frequency. The PSI connections are configured for inputting the test data into the scan chains of the logic block. The scanning is performed in accordance with the first bandwidth ratio. That is, the scanning of the test data, received over faster but narrower connections (e.g., SSIs), is performed over slower but wider connections (e.g., PSIs), wherein the scan rate and the number of internal connections is in accordance with the bandwidth ratio.

More particularly, the test data is collected over a first number of external clock cycles. That is, for each SSI connection, test data is collected over the first number of external clock cycles. Because of the relationships between frequencies and clock cycles, the first bandwidth ratio also defines the higher, first number of external clock cycles running at the first external clock frequency and a lower, first number of internal clock cycles running at the first internal clock frequency. As such, the test data collected over all the SSI connections over the first number of external clock cycles is then scanned over the first plurality of PSI connections over the first number of internal clock cycles.

That is, test data collected over each SSI connection is scanned over a corresponding number of PSI connections, in accordance with the bandwidth ratio. In particular, for each SSI connection, a corresponding first subset of test data is collected over the first number of external clock cycles. Thereafter, the corresponding first subset of test data is scanned over a corresponding first number of PSI connections over the first number of internal clock cycles using the internal clock frequency, wherein the first bandwidth ratio defines the first number of PSI connections, and the first number of SSI connections (in this case—one). For example, for a 4-to-1 ratio, the test data collected over one SSI connection, and over four fast clock cycles, is spread out over four PSI connections, and over one slow clock cycle, wherein the number of internal clock cycles is equal to one. The bandwidth ratio may be performed using various ratios having denominators equal to or greater than one, and other techniques, such as channel and frequency multiplexing, in embodiments of the present invention.
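The cycle and connection counts implied by a given bandwidth ratio can be tabulated with a short Python sketch. The function name scan_cycle_counts and its dictionary keys are hypothetical; the sketch assumes an integer ratio with a denominator of one, as in the 4-to-1 example above.

```python
from typing import Dict


def scan_cycle_counts(num_ssi: int, ratio: int, internal_cycles: int = 1) -> Dict[str, int]:
    """Relate external and internal cycle and connection counts for an N-to-1 ratio."""
    if num_ssi < 1 or ratio < 1 or internal_cycles < 1:
        raise ValueError("all arguments must be positive")
    external_cycles = ratio * internal_cycles
    num_psi = ratio * num_ssi
    return {
        "external_cycles": external_cycles,
        "internal_cycles": internal_cycles,
        "num_psi": num_psi,
        # Bits collected on the narrow, fast side ...
        "bits_collected": num_ssi * external_cycles,
        # ... equal the bits scanned on the wide, slow side.
        "bits_scanned": num_psi * internal_cycles,
    }


counts = scan_cycle_counts(num_ssi=1, ratio=4)
print(counts)
```

For the 4-to-1 case with one SSI connection, four bits are collected over four external clock cycles and scanned over four PSI connections in one internal clock cycle.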

In addition, the plurality of scan chains will output results during testing. That is, concurrent with or after the test data is shifted into the scan chains, results are then shifted out from the logic block using the PSO connections and routed to the system output connections, such as output connections of the corresponding SoC. Specifically, output data clocked at the internal clock frequency is received from the plurality of scan chains over a first plurality of PSO connections associated with the logic block. The output data is serialized and scanned over a plurality of SSO connections clocked at the first external clock frequency.

In one embodiment, the output data is serialized in accordance with the first bandwidth ratio. As such, the first bandwidth ratio defines the first plurality of PSO connections and the first plurality of SSO connections. Since the deserialization and serialization are performed in accordance with the first bandwidth ratio, the number of SSI connections is equal to the number of SSO connections, and the number of PSI connections is equal to the number of PSO connections (which also defines the number of scan chains). In that manner, the output data is scanned to the SSO connections according to the first bandwidth ratio, such that the output data is collected from the first plurality of PSO connections over said first number of internal clock cycles at said first internal clock frequency, and scanned over the first plurality of SSO connections over said first number of external clock cycles at the first external clock frequency. For example, for a 4-to-1 ratio, the test results from four PSO connections over one slow clock cycle are collected and delivered over one SSO connection over four fast clock cycles, wherein the number of internal clock cycles is equal to one.

For clarification, the bandwidth ratio defines a number of interrelated factors and/or components. As previously described, the bandwidth ratio defines the external frequency used to input test data from the ATE and the internal scan chain frequency used to scan test data across the scan chains in the logic block; the number of SSI to PSI connections; the number of external clocks used for collecting test data across the SSI connections and the number of internal clocks used for scanning test data across the PSI connections; the number of PSO connections to SSO connections, the number of internal clocks used for collecting test results across the PSO connections and the number of external clocks used for scanning test results across the SSO connections; and the internal scan chain frequency used to deliver test results across the PSO connections and the external frequency used to scan test results across the SSO connections.

In another embodiment, the output data is serialized in general. That is, serialization does not occur in accordance with the first bandwidth ratio. For example, the testing may be designed to produce a small amount of data. Instead of using all of the available SSO connections for outputting test results, fewer SSO connections are utilized, possibly over fewer clock cycles. That is, the bandwidth ratio on the output side may be different than the bandwidth ratio on the input side of the logic block.

In one embodiment, bi-directional communication is implemented, such that the first plurality of SSI connections of the logic block includes at least one of the plurality of SSO connections. With bi-directional control, a connection may provide both input and output functionality. In that case, an SSI/SSO connection may receive input test data at one time, and output test results at a different time.

Because the DSTA modules provide configurable bandwidth ratios to be used for a particular logic block, that logic block may be incorporated into a derivative SoC possibly having a different number of access connections. In addition, the test pattern designed for testing the logic block incorporated into the source SoC may also be used for testing the logic block that is now incorporated into the derivative SoC. By configuring the DSTA modules associated with the logic block (e.g., either at the local level in the logic block, or at the system level in the SoC incorporating the logic block), the bandwidth ratio selected will match the external data rate to the internal scan data rate given the number of physically available external access connections, though not all may be utilized. For example, a DSTA module will be configured for the logic block that is integrated within a second, derivative chip to a second bandwidth ratio. The second bandwidth ratio defines a second plurality of SSI connections and a second plurality of PSI connections. The SSI access connections are used for inputting the test data from the ATE, such that the second plurality of SSI connections is configured for receiving the test data clocked at a second external clock frequency. In addition, the second plurality of PSI connections is configured for inputting the test data to the plurality of scan chains according to the second bandwidth ratio. In addition, the second bandwidth ratio also defines the second external clock frequency and the second internal clock frequency.

In one embodiment, the first plurality of PSI connections is identical to the second plurality of PSI connections, even though the first and second bandwidth ratios may be different. That is, internal to the logic block, the same number of physical PSI connections are used to input test data to the scan chains. To optimize testing, all of the PSI connections should be utilized given the different bandwidth ratios, or at least a maximum number of PSI connections should be utilized.

FIGS. 3A-E are block diagrams illustrating the implementation of DSTA modules for a logic block design integrated within five different SoC configurations, wherein the DSTA module is configured to implement flexible bandwidth ratios for test pattern reuse such that each SoC configuration is accessed using a corresponding bandwidth ratio, in accordance with embodiments of the present disclosure. That is, individual logic blocks of similar design can be run at different bandwidth ratios depending on the access abilities of different SoC platforms incorporating those logic blocks, and how timing signoff is accomplished for those logic blocks.

The logic blocks shown in FIGS. 3A-E are similar in that they contain the same integrated circuit configuration; however, each is configured with a different bandwidth ratio to match a corresponding external data rate from an ATE to an internal scan data rate of the logic block, as they are incorporated in different SoC platforms. FIGS. 3A-E are provided for illustration of the implementation of DSTA load and unload modules for configuring bandwidth ratios between the external scan data rate and the internal scan chain data rate associated with a particular logic block. That is, while certain bandwidth ratios are provided for illustration in FIGS. 3A-E, other differently configured bandwidth ratios for the particular logic block may be supported. By providing flexible and configurable bandwidth ratios, scan access pin requirements can be further reduced while still matching the data bandwidths between high speed scan data rate from ATE and internal scan chains data rate in order to fully utilize the available channel bandwidths of ATE data, load deserializer, and unload serializer scan architectures.

For each logic block shown in FIGS. 3A-E, the DSTA modules may be built on an existing fixed bandwidth ratio design, wherein embodiments of the present invention allow for configuring the bandwidth ratios to match the data bandwidth between high speed scan data from the ATE and internal scan chains of the logic block, as the logic block is incorporated into a corresponding SoC. For example, for logic block reuse, where the logic block is taken into a SoC with fewer connections (compared to the source/original SoC for which the logic block was designed, and where the test patterns were originally generated), the DSTA modules can be configured to operate at a higher ratio to match the high speed scan data rate from ATE and the internal scan chains data rate. In that manner, the existing test pattern for testing the logic block may be used for multiple SoC configurations with some formatting (e.g., configuring the bandwidth ratio).

It is understood, however, that in other embodiments, the DSTA modules are built on logic blocks that have flexible bandwidth ratio designs, in which case embodiments of the present invention still allow for configuring the bandwidth ratios to match the data bandwidth between high speed scan data from the ATE and internal scan chains of the logic block, as the logic block is incorporated into a corresponding SoC. For instance, the internal scan chain data rate may be configurable within a particular logic block design.

In one embodiment, the connections internal to the logic block (e.g., PSI/PSO) may remain fixed in each of the reuse cases shown in FIGS. 3A-E, although not all of the connections may be utilized in order to satisfy a corresponding bandwidth ratio. The unused SSIs/SSOs (which are connected to the SI/SO of the logic block) can be tied off or unconnected inside the “SOC NVSCANLINK,” in one embodiment. In this case, the loss in SSI/SSO is made up for by increasing the frequency of the fast clock, which in turn increases the rate at which SSI/SSO can be supplied or consumed. In another embodiment, the connections internal to the logic block may also be configurable to support various bandwidth ratios.

As shown in FIGS. 3A-E, internal to the logic block, scan chains connected to the PSI/PSO connections can operate at a fixed, slow clock frequency as timed originally for the source SoC. A fast clock to slow clock divider is configured to accomplish this. For example, on the tester of the ATE, the probe card (during wafer level test) can supply a 500 MHz fast clock. To support 24-to-1, 12-to-1, 8-to-1, 6-to-1 and 4-to-1 bandwidth ratios, the clock dividers need to support those same ratios. The slow clock signal 302 is fixed and operated at 62.5 MHz in FIGS. 3A-E. Based on the dynamic ratio requirements of these modules, the fast clock can be operated at 500/375/250 MHz from SoC to SoC. In other embodiments, the slow clock frequency may be configurable so that the data rate for the scan chains connected to PSI/PSO connections can operate at different frequencies, and be supported by different bandwidth ratios.
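The relationship between the fast clock, the slow clock, and the divider can be illustrated with the following sketch. It assumes the fast clock frequency is exactly the bandwidth ratio multiplied by the fixed 62.5 MHz slow clock, and it models the divider as a simple modulo counter; the function names are hypothetical and the actual divider circuit is not described at this level in the text.

```python
from typing import List

SLOW_CLOCK_MHZ = 62.5  # fixed slow clock frequency stated for FIGS. 3A-E


def fast_clock_mhz(ratio: int, slow_mhz: float = SLOW_CLOCK_MHZ) -> float:
    """Fast clock needed so that `ratio` fast cycles fit in one slow cycle (assumed)."""
    return ratio * slow_mhz


def slow_clock_edges(ratio: int, fast_cycles: int) -> List[int]:
    """Fast-cycle indices at which a divide-by-`ratio` counter would emit a slow edge."""
    return [c for c in range(fast_cycles) if c % ratio == ratio - 1]


for ratio in (4, 6, 8):
    print(f"{ratio}-to-1 -> fast clock {fast_clock_mhz(ratio)} MHz")

print(slow_clock_edges(ratio=4, fast_cycles=8))  # [3, 7]
```

Under this assumption, the 4-to-1, 6-to-1, and 8-to-1 ratios correspond to 250 MHz, 375 MHz, and 500 MHz, which agrees with the 500/375/250 MHz operating points mentioned above.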

FIG. 3A shows the implementation of DSTA modules within a logic block 300A configured to operate at a default 4-to-1 bandwidth ratio, wherein each fast scan input (SSI) can drive up to four internal PSI connections of the logic block 300A. In accordance with the 4-to-1 bandwidth ratio, the SSI connections are operating four times faster than the PSI connections.

As shown, a first plurality of SSI connections is configured for supplying test data clocked at a first external clock frequency by an external clock signal 301 (e.g., provided by ATE). The test data is designed for testing a logic block when input to a plurality of scan chains of the logic block 300A. For example, six SSI connections (e.g., SSI-0, SSI-1, SSI-2, SSI-3, SSI-4, and SSI-5) are loaded with representative test data (e.g., “A-B-C-D”). For illustration, “A-B-C-D” test data is input into each SSI connection over four clock cycles in accordance with the bandwidth ratio. However, it is understood that each SSI connection may include different test data, such that SSI-0 may contain test data “A0-B0-C0-D0”, and SSI-1 may contain test data “A1-B1-C1-D1,” etc.

As previously described, the test data is input into a DSTA load module 330, which in FIG. 3A is configured to operate at a 4-to-1 bandwidth ratio. The DSTA load module 330 deserializes the test data to spread the data over more PSI connections at a slower rate. As a representative example of each of the SSI connections, the test data over the SSI-0 connection and collected over four clock cycles of the fast clock signal 301 (e.g., containing “A-B-C-D” test data) is delivered over four PSI connections (e.g., PSI-(0-3)) over one clock cycle of the slow clock signal 302, as highlighted by the dashed circle 303. Given the 4-to-1 bandwidth ratio of FIG. 3A, the test data delivered over six SSI connections (e.g., SSI-(0-5)) is spread across twenty-four PSI channels (e.g., PSI-(0-23)). Since the same test pattern is reused in each of the logic blocks 300A-E of FIGS. 3A-E, the test data is scanned into the scan chains 310 identically from PSI-(0-23) in each of the logic blocks 300A-E using the same slow clock signal 302. In this case, all of the physical SSI connections are utilized to satisfy the 4-to-1 bandwidth ratio.
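The deserialization performed by the DSTA load module can be modeled behaviorally as follows. This is a sketch under stated assumptions rather than the module's actual logic: the function name deserialize is hypothetical, and the ordering in which sample j of SSI-k lands on PSI-(k*ratio + j) is assumed from the SSI-0 to PSI-(0-3) example above.

```python
from typing import List, Sequence


def deserialize(ssi_streams: Sequence[Sequence[str]], ratio: int) -> List[str]:
    """Spread `ratio` fast-clock samples from each SSI lane across `ratio` PSI lanes.

    The returned list is the PSI word presented in one slow clock cycle.
    """
    psi_word: List[str] = []
    for lane, samples in enumerate(ssi_streams):
        if len(samples) != ratio:
            raise ValueError(f"SSI-{lane} must carry exactly {ratio} samples")
        # Assumed ordering: SSI-k sample j -> PSI-(k * ratio + j).
        psi_word.extend(samples)
    return psi_word


# FIG. 3A style example: six SSI lanes, each carrying "A-B-C-D" over four fast cycles.
ssi = [list("ABCD") for _ in range(6)]
psi = deserialize(ssi, ratio=4)
print(len(psi), psi[:4])  # 24 ['A', 'B', 'C', 'D']
```

The six four-sample streams produce one twenty-four-entry PSI word, with the SSI-0 samples occupying PSI-(0-3).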

As shown in FIGS. 3A-E, in an attempt to standardize the connectivity in this specific use case, the number of PSI and PSO connections per DSTA module pair (load and/or unload) is fixed at twenty-four. As such, the dynamic bandwidth ratio can be configured identically for all logic blocks (of similar design) in a particular SoC. In another embodiment, the DSTA module pairs may support different bandwidth ratios across the logic block for lower channel depth resulting in low probe card memories.

The deserialized test data from the PSI connections (e.g., PSI-(0-23)) is optionally decompressed by the on chip test decompression module 320, if the test data was originally compressed by the ATE. After decompression, the test data is delivered from the PSI connections to the corresponding scan chains 310. As shown, the PSI connections and scan chains are configured in a one-to-one relationship.

Embodiments of the present invention providing configurable DSTA modules associated with a particular logic block to achieve different bandwidth ratios are implemented with logic blocks configured for decompression/compression, or for logic blocks that do not implement any decompression/compression of the test data and test results during production testing or online system level testing. In addition, the logic blocks configured with decompression/compression of the test data and test results can support various types of scan compression/decompression techniques.

The test results shifting out of the scan chains 310 are then optionally compressed by the on chip test compression module 325 and delivered to corresponding PSO connections, if the logic block is configured for compression/decompression. For example, after compression the test results associated with the test data input over PSI-(0-3) connections are output from the scan chains over the PSO-(0-3) connections. As shown, the PSI connections, scan chains, and PSO connections are configured in a one-to-one-to-one relationship.

After compression, the test results are delivered from the PSO-(0-23) connections to the DSTA unload module 340, which in FIG. 3A is configured to operate at a 4-to-1 bandwidth ratio. The DSTA unload module 340 serializes the test results to consolidate the test results over fewer SSO connections, but at a higher rate. As a representative example of each of the SSO connections, the test results collected over PSO-(0-3) connections containing “A-B-C-D” test results are serialized for delivery over the SSO-0 connection, as highlighted by the dashed circle 304. For example, the test results over four PSO-(0-3) connections and collected over one clock cycle of the slow clock signal 302 are outputted over a single SSO-0 connection over four cycles of the fast external clock signal 301. Given the 4-to-1 bandwidth ratio of FIG. 3A, the test results delivered over twenty-four PSO-(0-23) connections are consolidated across six SSO-(0-5) connections. In this case, all of the physical SSO connections are utilized to satisfy the 4-to-1 bandwidth ratio.
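The serialization performed by the DSTA unload module can be modeled in the same behavioral style. The function name serialize is hypothetical, and the grouping of consecutive PSO connections onto one SSO connection is an assumption taken from the PSO-(0-3) to SSO-0 example above.

```python
from typing import List, Sequence


def serialize(pso_word: Sequence[str], ratio: int) -> List[List[str]]:
    """Consolidate one slow-cycle PSO word onto SSO lanes.

    Each returned inner list holds the `ratio` samples shifted out on one SSO
    lane over `ratio` fast clock cycles.
    """
    if len(pso_word) % ratio != 0:
        raise ValueError("ratio must evenly divide the number of PSO connections")
    # Assumed grouping: PSO-(k*ratio .. k*ratio + ratio - 1) -> SSO-k.
    return [list(pso_word[i:i + ratio]) for i in range(0, len(pso_word), ratio)]


# FIG. 3A style example: twenty-four PSO values consolidated at 4-to-1.
pso = list("ABCD") * 6
sso = serialize(pso, ratio=4)
print(len(sso), sso[0])  # 6 ['A', 'B', 'C', 'D']
```

Twenty-four PSO values at a 4-to-1 ratio are consolidated onto six SSO lanes, each carrying four samples.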

FIG. 3B shows the implementation of DSTA modules within a logic block 300B configured to operate at a 6-to-1 bandwidth ratio, wherein each fast scan input (SSI) can drive up to six internal PSI connections of the logic block 300B. In accordance with the 6-to-1 bandwidth ratio, the SSI connections are operating six times faster than the PSI connections. The logic block 300B is of the same design as logic block 300A of FIG. 3A (e.g., includes the same components PSI, compression/decompression modules, scan chains, PSO, DSTA modules, etc.), but is configured for a 6-to-1 bandwidth ratio.

As shown, a first plurality of SSI connections is configured for supplying test data clocked at a first external clock frequency by an external clock signal 311 (e.g., provided by ATE). The first plurality of SSI connections utilized is less than the number of physically available SSI connections to satisfy the 6-to-1 bandwidth ratio. The test data is designed for testing a logic block when input to a plurality of scan chains of the logic block 300B. For example, four SSI connections (e.g., SSI-0, SSI-1, SSI-2, and SSI-3) are loaded with representative test data (e.g., “A-B-C-D”) in various configurations over six clock cycles in accordance with the bandwidth ratio, as follows: “A-B-C-D-A-B” test data is input into SSI-0; “C-D-A-B-C-D” test data is input into SSI-1; “A-B-C-D-A-B” test data is input into SSI-2, and “C-D-A-B-C-D” test data is input into SSI-3. However, it is understood that each SSI connection may include different test data, such that SSI-0 may contain test data “A0-B0-C0-D0-A1-B1”, and SSI-1 may contain test data “C1-D1-A2-B2-C2-D2,” etc.

As previously described, the test data is input into a DSTA load module 330, which in FIG. 3B is configured to operate at a 6-to-1 bandwidth ratio. The DSTA load module 330 deserializes the test data to spread the data over more PSI connections at a slower rate. As a representative example of each of the SSI connections, the test data over the SSI-0 connection and collected over six clock cycles of the fast clock signal 311 (e.g., containing “A-B-C-D-A-B” test data) is delivered over six PSI connections (e.g., PSI-(0-5)) over one clock cycle of the slow clock signal 302, as highlighted by the dashed circle 313. Given the 6-to-1 bandwidth ratio of FIG. 3B, the test data delivered over four SSI channels (e.g., SSI-(0-3)) is spread across twenty-four PSI channels (e.g., PSI-(0-23)). Since the same test pattern is reused in each of the logic blocks 300A-E of FIGS. 3A-E, the test data is scanned into the scan chains 310 identically from PSI-(0-23) in each of the logic blocks 300A-E using the same slow clock signal 302. In this case, four of the six physical SSI connections are utilized to satisfy the 6-to-1 bandwidth ratio.

The deserialized test data from the PSI connections (e.g., PSI-(0-23)) is optionally decompressed by the on chip test decompression module 320, if the test data was originally compressed by the ATE. After decompression, the test data is delivered from the PSI connections to the corresponding scan chains 310. As shown, the PSI connections and scan chains are configured in a one-to-one relationship.

The test results shifting out of the scan chains 310 are then optionally compressed by the on chip test compression module 325 and delivered to corresponding PSO connections, if the logic block is configured for compression/decompression. For example, after compression the test results associated with the test data input over PSI-(0-5) connections are output from the scan chains over the PSO-(0-5) connections. As shown, the PSI connections, scan chains, and PSO connections are configured in a one-to-one-to-one relationship.

After compression, the test results are delivered from the PSO-(0-23) connections to the DSTA unload module 340, which in FIG. 3B is configured to operate at a 6-to-1 bandwidth ratio. The DSTA unload module 340 serializes the test results to consolidate the test results over fewer SSO connections, but at a higher rate. As a representative example of each of the SSO connections, the test results collected over PSO-(0-5) connections containing “A-B-C-D-A-B” test results are serialized for delivery over the SSO-0 connection, as highlighted by the dashed circle 314. For example, the test results over six PSO-(0-5) connections and collected over one clock cycle of the slow clock signal 302 are outputted over a single SSO-0 connection over six cycles of the fast external clock signal 311. Given the 6-to-1 bandwidth ratio of FIG. 3B, the test results delivered over twenty-four PSO-(0-23) connections are consolidated across four SSO-(0-3) connections. In this case, four of the six physical SSO connections are utilized to satisfy the 6-to-1 bandwidth ratio.

FIG. 3C shows the implementation of DSTA modules within a logic block 300C configured to operate at an 8-to-1 bandwidth ratio, wherein each fast scan input (SSI) can drive up to eight internal PSI connections of the logic block 300C. In accordance with the 8-to-1 bandwidth ratio, the SSI connections are operating eight times faster than the PSI connections. The logic block 300C is of the same design as logic blocks 300A-B of FIGS. 3A-B (e.g., includes the same components PSI, compression/decompression modules, scan chains, PSO, DSTA modules, etc.), but is configured for an 8-to-1 bandwidth ratio.

As shown, a first plurality of SSI connections is configured for supplying test data clocked at a first external clock frequency by an external clock signal 321 (e.g., provided by ATE). The first plurality of SSI connections utilized is less than the number of physically available SSI connections to satisfy the 8-to-1 bandwidth ratio. The test data is designed for testing a logic block when input to a plurality of scan chains of the logic block 300C. For example, three SSI connections (e.g., SSI-0, SSI-1, and SSI-2) are loaded with representative test data (e.g., “A-B-C-D”, etc.) in various configurations over eight clock cycles in accordance with the bandwidth ratio, as follows: “A-B-C-D-A-B-C-D” test data is input into SSI-0; “A-B-C-D-A-B-C-D” test data is input into SSI-1; and “A-B-C-D-A-B-C-D” test data is input into SSI-2. However, it is understood that each SSI connection may include different test data, such that SSI-0 may contain test data “A0-B0-C0-D0-A1-B1-C1-D1”, and SSI-1 may contain test data “A2-B2-C2-D2-A3-B3-C3-D3,” etc.

As previously described, the test data is input into a DSTA load module 330, which in FIG. 3C is configured to operate at an 8-to-1 bandwidth ratio. The DSTA load module 330 deserializes the test data to spread the data over more PSI connections at a slower rate. As a representative example of each of the SSI connections, the test data over the SSI-0 connection and collected over eight clock cycles of the fast clock signal 321 (e.g., containing “A-B-C-D-A-B-C-D” test data) is delivered over eight PSI connections (e.g., PSI-(0-7)) over one clock cycle of the slow clock signal 302, as highlighted by the dashed ellipse 323. Given the 8-to-1 bandwidth ratio of FIG. 3C, the test data delivered over three SSI channels (e.g., SSI-(0-2)) is spread across twenty-four PSI channels (e.g., PSI-(0-23)). Since the same test pattern is reused in each of the logic blocks 300A-E of FIGS. 3A-E, the test data is scanned into the scan chains 310 identically from PSI-(0-23) in each of the logic blocks 300A-E using the same slow clock signal 302. In this case, three of the six physical SSI connections are utilized to satisfy the 8-to-1 bandwidth ratio.

The deserialized test data from the PSI connections (e.g., PSI-(0-23)) is optionally decompressed by the on chip test decompression module 320, if the test data was originally compressed by the ATE. After decompression, the test data is delivered from the PSI connections to the corresponding scan chains 310. As shown, the PSI connections and scan chains are configured in a one-to-one relationship.

The test results shifting out of the scan chains 310 are then optionally compressed by the on chip test compression module 325 and delivered to corresponding PSO connections, if the logic block is configured for compression/decompression. For example, after compression the test results associated with the test data input over PSI-(0-7) connections are output from the scan chains over the PSO-(0-7) connections. As shown, the PSI connections, scan chains, and PSO connections are configured in a one-to-one-to-one relationship.

After compression, the test results are delivered from the PSO-(0-23) connections to the DSTA unload module 340, which in FIG. 3C is configured to operate at an 8-to-1 bandwidth ratio. The DSTA unload module 340 serializes the test results to consolidate the test results over fewer SSO connections, but at a higher rate. As a representative example of each of the SSO connections, the test results collected over PSO-(0-7) connections containing “A-B-C-D-A-B-C-D” test results are serialized for delivery over the SSO-0 connection, as highlighted by the dashed ellipse 324. For example, the test results over eight PSO-(0-7) connections and collected over one clock cycle of the slow clock signal 302 are outputted over a single SSO-0 connection over eight cycles of the fast external clock signal 321. Given the 8-to-1 bandwidth ratio of FIG. 3C, the test results delivered over twenty-four PSO-(0-23) connections are consolidated across three SSO-(0-2) connections. In this case, three of the six physical SSO connections are utilized to satisfy the 8-to-1 bandwidth ratio.

FIG. 3D shows the implementation of DSTA modules within a logic block 300D configured to operate at a 12-to-1 bandwidth ratio, wherein each fast scan input (SSI) can drive up to twelve internal PSI connections of the logic block 300D. In accordance with the 12-to-1 bandwidth ratio, the SSI connections are operating twelve times faster than the PSI connections. The logic block 300D is of the same design as logic blocks 300A-C of FIGS. 3A-C (e.g., includes the same components PSI, compression/decompression modules, scan chains, PSO, DSTA modules, etc.), but is configured for a 12-to-1 bandwidth ratio.

As shown, a first plurality of SSI connections is configured for supplying test data clocked at a first external clock frequency by an external clock signal 331 (e.g., provided by ATE). The first plurality of SSI connections utilized is less than the number of physically available SSI connections to satisfy the 12-to-1 bandwidth ratio. The test data is designed for testing a logic block when input to a plurality of scan chains of the logic block 300D. For example, two SSI connections (e.g., SSI-0 and SSI-1) are loaded with representative test data (e.g., “A-B-C-D”, etc.) in various configurations over twelve clock cycles in accordance with the bandwidth ratio, as follows: “A-B-C-D-A-B-C-D-A-B-C-D” test data is input into SSI-0; and “A-B-C-D-A-B-C-D-A-B-C-D” test data is input into SSI-1. However, it is understood that each SSI connection may include different test data, such that SSI-0 may contain test data “A0-B0-C0-D0-A1-B1-C1-D1-A2-B2-C2-D2”, and SSI-1 may contain test data “A3-B3-C3-D3-A4-B4-C4-D4-A5-B5-C5-D5,” etc.

As previously described, the test data is input into a DSTA load module 330, which in FIG. 3D is configured to operate at a 12-to-1 bandwidth ratio. The DSTA load module 330 deserializes the test data to spread the data over more PSI connections at a slower rate. As a representative example of each of the SSI connections, the test data over the SSI-0 connection and collected over twelve clock cycles of the fast clock signal 331 (e.g., containing “A-B-C-D-A-B-C-D-A-B-C-D” test data) is delivered over twelve PSI connections (e.g., PSI-(0-11)) over one clock cycle of the slow clock signal 302, as highlighted by the dashed ellipse 333. Given the 12-to-1 bandwidth ratio of FIG. 3D, the test data delivered over two SSI channels (e.g., SSI-(0-1)) is spread across twenty-four PSI channels (e.g., PSI-(0-23)). Since the same test pattern is reused in each of the logic blocks 300A-E of FIGS. 3A-E, the test data is scanned into the scan chains 310 identically from PSI-(0-23) in each of the logic blocks 300A-E using the same slow clock signal 302. In this case, two of the six physical SSI connections are utilized to satisfy the 12-to-1 bandwidth ratio.

The deserialized test data from the PSI connections (e.g., PSI-(0-23)) is optionally decompressed by the on chip test decompression module 320, if the test data was originally compressed by the ATE. After decompression, the test data is delivered from the PSI connections to the corresponding scan chains 310. As shown, the PSI connections and scan chains are configured in a one-to-one relationship.

The test results shifting out of the scan chains 310 are then optionally compressed by the on chip test compression module 325 and delivered to corresponding PSO connections, if the logic block is configured for compression/decompression. For example, after compression the test results associated with the test data input over PSI-(0-11) connections are output from the scan chains over the PSO-(0-11) connections. As shown, the PSI connections, scan chains, and PSO connections are configured in a one-to-one-to-one relationship.

After compression, the test results are delivered from the PSO-(0-23) connections to the DSTA unload module 340, which in FIG. 3D is configured to operate at a 12-to-1 bandwidth ratio. The DSTA unload module 340 serializes the test results to consolidate the test results over fewer SSO connections, but at a higher rate. As a representative example of each of the SSO connections, the test results collected over PSO-(0-11) connections containing “A-B-C-D-A-B-C-D-A-B-C-D” test results are serialized for delivery over the SSO-0 connection, as highlighted by the dashed ellipse 334. For example, the test results over twelve PSO-(0-11) connections and collected over one clock cycle of the slow clock signal 302 are outputted over a single SSO-0 connection over twelve cycles of the fast external clock signal 331. Given the 12-to-1 bandwidth ratio of FIG. 3D, the test results delivered over twenty-four PSO-(0-23) connections are consolidated across two SSO-(0-1) connections. In this case, two of the six physical SSO connections are utilized to satisfy the 12-to-1 bandwidth ratio.

FIG. 3E shows the implementation of DSTA modules within a logic block 300E configured to operate at a 24-to-1 bandwidth ratio, wherein each fast scan input (SSI) can drive up to twenty-four internal PSI connections of the logic block 300E. In accordance with the 24-to-1 bandwidth ratio, the SSI connections are operating twenty-four times faster than the PSI connections. The logic block 300E is of the same design as logic blocks 300A-D of FIGS. 3A-D (e.g., includes the same components PSI, compression/decompression modules, scan chains, PSO, DSTA modules, etc.), but is configured for a 24-to-1 bandwidth ratio.

As shown, a first plurality of SSI connections is configured for supplying test data clocked at a first external clock frequency by an external clock signal 341 (e.g., provided by ATE). The first plurality of SSI connections utilized is less than the number of physically available SSI connections to satisfy the 24-to-1 bandwidth ratio. The test data is designed for testing a logic block when input to a plurality of scan chains of the logic block 300E. For example, a single SSI connection (e.g., SSI-0) is loaded with representative test data (e.g., “A-B-C-D”, etc.) in various configurations over twenty-four clock cycles in accordance with the bandwidth ratio, as follows: “A-B-C-D-A-B-C-D-A-B-C-D-A-B-C-D-A-B-C-D-A-B-C-D” test data is input into SSI-0.

As previously described, the test data is input into a DSTA load module 330, which in FIG. 3E is configured to operate at a 24-to-1 bandwidth ratio. The DSTA load module 330 deserializes the test data to spread the data over more PSI connections at a slower rate. As a representative example, the test data over the SSI-0 connection and collected over twenty-four clock cycles of the fast clock signal 341 (e.g., containing “A-B-C-D-A-B-C-D-A-B-C-D-A-B-C-D-A-B-C-D-A-B-C-D” test data) is delivered over twenty-four PSI connections (e.g., PSI-(0-23)) over one clock cycle of the slow clock signal 302. Given the 24-to-1 bandwidth ratio of FIG. 3E, the test data delivered over the single SSI-0 is spread across twenty-four PSI channels (e.g., PSI-(0-23)). Since the same test pattern is reused in each of the logic blocks 300A-E of FIGS. 3A-E, the test data is scanned into the scan chains 310 identically from PSI-(0-23) in each of the logic blocks 300A-E using the same slow clock signal 302. In this case, one of the six physical SSI connections is utilized to satisfy the 24-to-1 bandwidth ratio.

The deserialized test data from the PSI connections (e.g., PSI-(0-23)) is optionally decompressed by the on chip test decompression module 320, if the test data was originally compressed by the ATE. After decompression, the test data is delivered from the PSI connections to the corresponding scan chains 310. As shown, the PSI connections and scan chains are configured in a one-to-one relationship.

The test results shifting out of the scan chains 310 are then optionally compressed by the on chip test compression module 325 and delivered to corresponding PSO connections, if the logic block is configured for compression/decompression. For example, after compression the test results associated with the test data input over PSI-(0-23) connections are output from the scan chains over the PSO-(0-23) connections. As shown, the PSI connections, scan chains, and PSO connections are configured in a one-to-one-to-one relationship.

After compression, the test results are delivered from the PSO-(0-23) connections to the DSTA unload module 340, which in FIG. 3E is configured to operate at a 24-to-1 bandwidth ratio. The DSTA unload module 340 serializes the test results to consolidate the test results over fewer SSO connections, but at a higher rate. As a representative example, the test results collected over PSO-(0-23) connections containing “A-B-C-D-A-B-C-D-A-B-C-D-A-B-C-D-A-B-C-D-A-B-C-D” test results are serialized for delivery over the SSO-0 connection. For example, the test results over twenty-four PSO-(0-23) connections and collected over one clock cycle of the slow clock signal 302 are outputted over a single SSO-0 connection over twenty-four cycles of the fast external clock signal 341. Given the 24-to-1 bandwidth ratio of FIG. 3E, the test results delivered over twenty-four PSO-(0-23) connections are consolidated across one SSO-0 connection. In this case, one of the six physical SSO connections is utilized to satisfy the 24-to-1 bandwidth ratio.
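The five configurations of FIGS. 3A-E can be recapped with a short sketch that lists, for each bandwidth ratio, which of the six physical SSI connections are utilized and which are left tied off or unconnected. The function name lane_usage is hypothetical, and the sketch assumes that the lowest-numbered connections are the ones utilized, as in the SSI-(0-3), SSI-(0-2), and SSI-(0-1) examples above. The same arithmetic applies to the SSO connections.

```python
from typing import List, Tuple

PHYSICAL_SSI = 6  # physical SSI connections in FIGS. 3A-E
NUM_PSI = 24      # PSI connections per DSTA module pair in FIGS. 3A-E


def lane_usage(ratio: int) -> Tuple[List[str], List[str]]:
    """Return (utilized, unused) SSI connection names for an N-to-1 ratio."""
    used = NUM_PSI // ratio
    utilized = [f"SSI-{i}" for i in range(used)]
    unused = [f"SSI-{i}" for i in range(used, PHYSICAL_SSI)]  # tied off or unconnected
    return utilized, unused


for ratio in (4, 6, 8, 12, 24):
    utilized, unused = lane_usage(ratio)
    print(f"{ratio:>2}-to-1: {len(utilized)} utilized, {len(unused)} unused")
```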

FIG. 4 is a diagram illustrating test pattern reuse of a logic block through the implementation of flexible bandwidth ratios of one or more DSTA modules integrated within the logic block, in accordance with one embodiment of the present disclosure. As shown, netlist patterns 420 having various bandwidth ratios (e.g., 4X, 8X, 12X, and 24X) are input into the test pattern generator 410. The test pattern generated is independent of the netlist patterns having different bandwidth ratios, but is formatted for proper input into the SSI connections. The output of the test pattern generator 410 provides various formats, including an eX Mode STIL Pattern and a Legacy Mode STIL Pattern.
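As a non-limiting sketch of the reuse described above, the same parallel test word may be formatted for any of the listed bandwidth ratios, and deserialization at the logic block recovers the identical word in each case. The sketch assumes twenty-four PSI connections and contiguous grouping of PSI connections per SSI connection; the function names and the grouping are hypothetical, and the sketch does not model the STIL output formats.

```python
from typing import List, Sequence


def format_for_ssi(parallel_word: Sequence[int], ratio: int) -> List[List[int]]:
    """Split one parallel word (one value per PSI connection) into SSI streams.

    Assumption (hypothetical): each SSI connection serially carries `ratio`
    consecutive PSI values.
    """
    if ratio <= 0 or len(parallel_word) % ratio != 0:
        raise ValueError("ratio must evenly divide the PSI connection count")
    return [list(parallel_word[i:i + ratio])
            for i in range(0, len(parallel_word), ratio)]


def deserialize(ssi_streams: Sequence[Sequence[int]]) -> List[int]:
    """Rebuild the parallel PSI word from the per-SSI serial streams."""
    return [value for stream in ssi_streams for value in stream]


if __name__ == "__main__":
    word = [i % 2 for i in range(24)]  # one illustrative 24-bit parallel word
    for ratio in (4, 8, 12, 24):       # the 4X, 8X, 12X, and 24X ratios
        streams = format_for_ssi(word, ratio)
        # The scan chains see the same word regardless of the ratio chosen.
        assert deserialize(streams) == word
        print(f"{ratio}X uses {len(streams)} SSI connection(s)")
```

The design point illustrated is that the bandwidth ratio changes only the formatting onto the SSI connections, not the content delivered to the scan chains.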

FIG. 5 is a diagram illustrating the broadcasting of a test pattern to a plurality of logic blocks 530 similarly designed, but included within a single SoC 500, wherein each logic block contained within the SoC 500 includes a DSTA implementing flexible bandwidth ratios for test pattern reuse under the single SoC 500 platform, in accordance with one embodiment of the present disclosure. For example, SoC 500 may include multiple logic blocks (e.g., 530A-N) of the same design (e.g., including the same components PSI, compression/decompression modules, scan chains, PSO, DSTA modules, etc.). Each of the logic blocks 530A-N can be tested using the same test data or pattern as scanned using the same bandwidth ratio. As shown in FIG. 5, the test data is broadcast at the system level (SoC 500) by the broadcast module 520 after the test data is received from the ATE. For example, the test data is received over “L” input connections (e.g., SSI) at the system level. The test data is internally routed to the broadcast module 520, and then broadcast over sets of L connections to each of the multiple logic blocks 530A-N. For example, the test data is broadcast over a super plurality of SSI connections associated with the logic blocks 530A-N. The test data is clocked at a first external clock frequency (e.g., fast clock provided by the ATE). The super plurality of SSI connections includes the first plurality of SSI connections (e.g., numbering “L”) used for accessing logic block 530A, a second plurality of SSI connections (numbering “L”) used for accessing logic block 530B, and other pluralities of SSI connections configured for testing other similarly designed logic blocks.
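The broadcasting described above may be sketched, purely for illustration, as a fan-out of the values on the L system-level input connections to one set of L connections per logic block. The function names and the block labels used in the sketch are hypothetical; the sketch models only the data fan-out and not the clocking.

```python
from typing import Dict, List, Sequence


def broadcast(test_data: Sequence[int],
              block_names: Sequence[str]) -> Dict[str, List[int]]:
    """Copy the values on the L system-level SSI inputs to every logic block.

    Each block receives its own set of L connections carrying the same data.
    """
    return {name: list(test_data) for name in block_names}


def super_plurality_size(l_connections: int, block_count: int) -> int:
    """Total internal SSI connections: one set of L per logic block."""
    return l_connections * block_count


if __name__ == "__main__":
    ssi_inputs = [1, 0, 1, 1]               # L = 4, an illustrative value
    blocks = ["530A", "530B", "530C"]       # hypothetical block labels
    fanout = broadcast(ssi_inputs, blocks)
    print(fanout)
    print(super_plurality_size(len(ssi_inputs), len(blocks)))
```

Because every similarly designed logic block receives identical data at the same bandwidth ratio, a single test pattern supplied by the ATE can exercise all of the blocks.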

Thus, according to embodiments of the present disclosure, systems and methods are described providing for implementing a scan compression architecture for online logic testing at the system level.

While the foregoing disclosure sets forth various embodiments using specific block diagrams, flowcharts, and examples, each block diagram component, flowchart step, operation, and/or component described and/or illustrated herein may be implemented, individually and/or collectively, using a wide range of hardware, software, or firmware (or any combination thereof) configurations. In addition, any disclosure of components contained within other components should be considered as examples in that many architectural variants can be implemented to achieve the same functionality.

The process parameters and sequence of steps described and/or illustrated herein are given by way of example only and can be varied as desired. For example, while the steps illustrated and/or described herein may be shown or discussed in a particular order, these steps do not necessarily need to be performed in the order illustrated or discussed. The various example methods described and/or illustrated herein may also omit one or more of the steps described or illustrated herein or include additional steps in addition to those disclosed.

While various embodiments have been described and/or illustrated herein in the context of fully functional computing systems, one or more of these example embodiments may be distributed as a program product in a variety of forms, regardless of the particular type of computer-readable media used to actually carry out the distribution. The embodiments disclosed herein may also be implemented using software modules that perform certain tasks. These software modules may include script, batch, or other executable files that may be stored on a computer-readable storage medium or in a computing system. These software modules may configure a computing system to perform one or more of the example embodiments disclosed herein. One or more of the software modules disclosed herein may be implemented in a cloud computing environment. Cloud computing environments may provide various services and applications via the internet. These cloud-based services (e.g., software as a service, platform as a service, infrastructure as a service, etc.) may be accessible through a Web browser or other remote interface. Various functions described herein may be provided through a remote desktop environment or any other cloud-based computing environment.

The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as may be suited to the particular use contemplated.

Embodiments according to the present disclosure are thus described. While the present disclosure has been described in particular embodiments, it should be appreciated that the disclosure should not be construed as limited by such embodiments, but rather construed according to the below claims.

Additional information regarding an Ultra Fast Interface (UFI) is set forth in related co-pending application Number ______, entitled Granular Dynamic Test Systems and Methods (Attorney Docket No. NVID-PSC-0129US1) filed on Oct. 27, 2016, which is incorporated herein by reference.

Additional information regarding DSTA is set forth in related co-pending application Number ______, entitled Dynamic Scan Test Access (DSTA) (Attorney Docket No. NVID-PSC-0131US1) filed on Oct. 27, 2016, which is incorporated herein by reference.

Additional information regarding coordination of external pad configuration is set forth in related co-pending application Number ______, entitled Test Partition External Input/Output Interface Control (Attorney Docket No. NVID-PSC-0132US1) filed on Oct. 27, 2016, which is incorporated herein by reference.

Additional information regarding on-line test operations and JTAG test operations is set forth in related co-pending application Number ______, entitled Scan System Interface (SSI) (Attorney Docket No. NVID-PSC-0134US1) filed on Oct. 27, 2016, which is incorporated herein by reference.

Additional information regarding free running clock and independent test partition clock coordination is set forth in related co-pending application Number ______, entitled Dynamic Independent Test Partition Clocks (Attorney Docket No. NVID-PSC-0142US1) filed on Oct. 27, 2016, which is incorporated herein by reference.

Additional information regarding test partition clock staggering and peak power reduction is set forth in related co-pending application Number ______, entitled Independent Test Partition Clock Coordination Across Multiple Test Partitions (Attorney Docket No. NVID-PSC-0147US1) filed on Oct. 27, 2016, which is incorporated herein by reference. 

What is claimed:
 1. A computer system comprising: a processor; and memory coupled to said processor and having stored therein instructions that, if executed by said computer system, cause said computer system to execute a method for testing comprising: sending an instruction to a JTAG controller to select a first internal test data register of a plurality of data registers; programming said first internal test data register to configure mode control access and state control access for a test controller implementing a sequential scan architecture at a system level.
 2. The method of claim 1, wherein said programming said first internal test data register comprises: programming a mode/state control bit to a first value in said first internal test data register to indicate mode control access during a mode write phase; and programming mode values in said first internal test data register to indicate a test mode and a corresponding register to which input data received over a JTAG scan-in interface is stored in a subsequent state control access.
 3. The method of claim 1, wherein said programming said first internal test data register comprises: programming a mode/state control bit to a second value in said first internal test data register to indicate a state control access during a state write phase; receiving state control signals received over a JTAG scan-in interface during said state control access, and storing said state control signals in a plurality of dynamic state control registers; receiving input data over a JTAG scan-in interface and storing said input data in a corresponding register decoded from mode values programmed in a previous mode access, wherein storing of said input data is controlled by said state control signals.
 4. The method of claim 3, further comprising: resetting said mode/state control bit to a first value to allow writes to said first internal test data register in a subsequent mode write phase.
 5. The method of claim 1, further comprising: disabling production ATE testing.
 6. The method of claim 1, further comprising: generating control signals using a JTAG/IEEE 1500 interface when performing said programming of said first internal test data register.
 7. The method of claim 1, further comprising: alternating between a mode write phase and a state write phase when programming said first internal test data register that is configured for receiving mode control signals and state control signals without accessing an instruction register.
 8. The method of claim 1, further comprising: implementing a dynamic scan test access (DSTA) module to align input data received over said JTAG interface clocked at a first clock with said test controller that is clocked at a slower, second clock.
 9. A non-transitory computer-readable medium having computer-executable instructions for causing a computer system to perform a method for testing comprising: sending an instruction to a JTAG controller to select a first internal test data register of a plurality of data registers; programming said first internal test data register to configure mode control access and state control access for a test controller implementing a sequential scan architecture at a system level.
 10. The computer-readable medium of claim 9, wherein said programming said first internal test data register in said method comprises: programming a mode/state control bit to a first value in said first internal test data register to indicate mode control access during a mode write phase; and programming mode values in said first internal test data register to indicate a test mode and a corresponding register to which input data received over a JTAG scan-in interface is stored in a subsequent state control access.
 11. The computer-readable medium of claim 9, wherein said programming said first internal test data register in said method comprises: programming a mode/state control bit to a second value in said first internal test data register to indicate a state control access during a state write phase; receiving state control signals received over a JTAG scan-in interface during said state control access, and storing said state control signals in a plurality of dynamic state control registers; receiving input data over a JTAG scan-in interface and storing said input data in a corresponding register decoded from mode values programmed in a previous mode access, wherein storing of said input data is controlled by said state control signals.
 12. The computer-readable medium of claim 11, wherein said method further comprises: resetting said mode/state control bit to a first value to allow writes to said first internal test data register in a subsequent mode write phase.
 13. The computer-readable medium of claim 9, wherein said method further comprises: disabling production ATE testing.
 14. The computer-readable medium of claim 9, wherein said method further comprises: generating control signals using a JTAG/IEEE 1500 interface when performing said programming of said first internal test data register.
 15. The computer-readable medium of claim 9, wherein said method further comprises: alternating between a mode write phase and a state write phase when programming said first internal test data register that is configured for receiving mode control signals and state control signals without accessing an instruction register.
 16. The computer-readable medium of claim 9, wherein said method further comprises: implementing a dynamic scan test access (DSTA) module to align input data received over said JTAG interface clocked at a first clock with said test controller that is clocked at a slower, second clock.
 17. A method for testing, comprising: sending a single instruction over a JTAG interface to a JTAG controller to select a first internal test data register of a plurality of data registers; programming said first internal test data register using said JTAG interface to configure mode control access and state control access for a test controller implementing a sequential scan architecture to test a chip at a system level.
 18. The method of claim 17, wherein said programming said first internal test data register comprises: programming a mode/state control bit to a first value in said first internal test data register to indicate mode control access during a mode write phase; and programming mode values in said first internal test data register to indicate a test mode and a corresponding register to which input data received over a JTAG scan-in interface is stored in a subsequent state control access.
 19. The method of claim 17, wherein said programming said first internal test data register comprises: programming a mode/state control bit to a second value in said first internal test data register to indicate a state control access during a state write phase; receiving state control signals received over a JTAG scan-in interface during said state control access, and storing said state control signals in a plurality of dynamic state control registers; receiving input data over a JTAG scan-in interface and storing said input data in a corresponding register decoded from mode values programmed in a previous mode access, wherein storing of said input data is controlled by said state control signals.
 20. The method of claim 19, further comprising: resetting said mode/state control bit to a first value to allow writes to said first internal test data register in a subsequent mode write phase. 